With the rapid development of mobile communication technology, the number of telecom users keeps rising, which increasingly requires operators to attend to user experience and to keep improving satisfaction with network use. Based on customer voice and Internet service data provided by Beijing Mobile in 2022, this paper first uses grey relational analysis to select the important factors influencing satisfaction, and then adopts a Stacking multi-model fusion strategy that combines five algorithms (Random Forest, Logistic Regression, K-Nearest Neighbours, AdaBoost, and XGBoost) to predict customer satisfaction scores. The fused model achieves a prediction accuracy of 0.613 on the voice service data and 0.606 on the Internet service data, providing a theoretical reference for the prediction and analysis of telecom user satisfaction scores.
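The grey relational analysis step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the min-max normalization, the distinguishing coefficient `rho = 0.5`, the factor names, and the toy numbers are all assumptions chosen for illustration. Each candidate factor is compared with the reference sequence (here, the satisfaction score), and factors with a higher grey relational grade would be kept as the more important ones.

```python
from typing import Dict, List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(
    reference: Sequence[float],
    factors: Dict[str, Sequence[float]],
    rho: float = 0.5,  # distinguishing coefficient (assumed value)
) -> Dict[str, float]:
    """Return the grey relational grade of each factor against the reference."""
    ref = min_max(reference)
    # Absolute differences between the normalized reference and each factor.
    diffs = {
        name: [abs(r - f) for r, f in zip(ref, min_max(column))]
        for name, column in factors.items()
    }
    all_diffs = [d for column in diffs.values() for d in column]
    d_min, d_max = min(all_diffs), max(all_diffs)
    if d_max == 0:
        # Every factor coincides with the reference after normalization.
        return {name: 1.0 for name in diffs}
    grades = {}
    for name, column in diffs.items():
        # Grey relational coefficient per sample, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in column]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Toy data: hypothetical satisfaction scores and two hypothetical factors.
satisfaction = [1, 2, 3, 4, 5]
candidate_factors = {
    "signal_quality": [2, 4, 6, 8, 10],  # moves exactly with the score
    "billing_issues": [5, 1, 4, 2, 3],   # weakly related to the score
}
grades = grey_relational_grades(satisfaction, candidate_factors)
ranking = sorted(grades, key=grades.get, reverse=True)
```

In this sketch `ranking` lists factor names from the highest grade to the lowest; a threshold or top-k cut on the grades would then decide which factors are passed to the downstream models.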
With the rapid advancement of informatization, the telecommunications market is approaching saturation, and dealing with customer churn has become an urgent problem for telecom operators. Based on telecom user data, this paper conducts an in-depth predictive analysis of customer churn. First, missing values were imputed, features were encoded and derived, and SMOTE and Tomek Links were used to address class imbalance. Next, six individual models (Random Forest, XGBoost, SVM, Logistic Regression, AdaBoost, and GBDT) were each used to predict customer churn. To improve the accuracy and robustness of the predictions, the paper then applied Stacking multi-model fusion. The model comparison shows that using SVM as the second-layer model achieved the highest accuracy (0.8645), with all metrics better than those of the individual models. The study demonstrates the effectiveness of the Stacking ensemble model in predicting customer churn and identifies the key factors influencing churn through detailed