《计算机应用研究》|Application Research of Computers

Research on railway scattered freight customer churn prediction based on parallel C4.5 decision tree algorithm

Authors Zhang Bin, Peng Qiyuan, Liu Fanxiao
Affiliation School of Transportation & Logistics, Southwest Jiaotong University, Chengdu 610031, China
Article ID 1001-3695(2019)03-037-0829-04
DOI 10.19734/j.issn.1001-3695.2017.09.0912
Funding Major Project of the Scientific Research Plan of China Railway Corporation (2016X008-J)
URL http://www.arocmag.com/article/01-2019-03-037.html
Abstract To improve the accuracy and efficiency of customer churn prediction for railway scattered freight, this paper proposes a churn identification method based on the CDL model, derived from the churn characteristics of scattered-freight customers. On this basis, to handle the large data volume, it proposes a C4.5 decision tree churn prediction model built on the Hadoop parallel framework. Simulation experiments show that the model has good accuracy and predictive ability, and that the efficiency of the Hadoop parallel framework improves markedly as the number of samples grows, without affecting the accuracy or predictive ability of the churn prediction model.
Keywords railway transportation; scattered freight; customer churn; C4.5 decision tree; parallel; Hadoop
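Illustrative sketch (not from the paper): the abstract names a C4.5 decision tree parallelized on Hadoop but does not give the implementation. One common way to parallelize the attribute-selection step is to count (attribute, value, class) triples in a map/reduce pass and then compute the C4.5 gain ratio from the aggregated counts. The Python sketch below simulates that pattern in a single process under stated assumptions: the attribute names, the toy records, and the helper functions `map_record`, `reduce_counts`, and `gain_ratios` are all hypothetical, only categorical attributes are handled, and no Hadoop API is used.

```python
import math
from collections import Counter

def map_record(record):
    """Map step: emit ((attribute index, attribute value, label), 1) pairs."""
    *features, label = record
    for idx, value in enumerate(features):
        yield (idx, value, label), 1

def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts

def entropy(class_counts):
    """Shannon entropy (base 2) of a collection of positive counts."""
    class_counts = list(class_counts)
    total = sum(class_counts)
    return -sum(c / total * math.log2(c / total) for c in class_counts if c)

def gain_ratios(counts, n_attrs):
    """C4.5 gain ratio per categorical attribute, from aggregated counts."""
    ratios = {}
    for idx in range(n_attrs):
        by_value = {}       # attribute value -> Counter(label -> count)
        labels = Counter()  # label -> count over the whole data set
        for (i, value, label), n in counts.items():
            if i == idx:
                by_value.setdefault(value, Counter())[label] += n
                labels[label] += n
        total = sum(labels.values())
        base = entropy(labels.values())
        # Expected entropy after splitting on this attribute.
        cond = sum(sum(c.values()) / total * entropy(c.values())
                   for c in by_value.values())
        # Split information penalizes attributes with many values.
        split_info = entropy(sum(c.values()) for c in by_value.values())
        ratios[idx] = (base - cond) / split_info if split_info > 0 else 0.0
    return ratios

# Hypothetical toy records: (shipping frequency, complaint filed, label).
records = [
    ("low", "yes", "churn"),
    ("low", "no", "churn"),
    ("high", "no", "stay"),
    ("high", "yes", "stay"),
]

pairs = (pair for r in records for pair in map_record(r))
counts = reduce_counts(pairs)
ratios = gain_ratios(counts, n_attrs=2)
best = max(ratios, key=ratios.get)  # index of the attribute to split on
print(best, ratios)
```

In a real Hadoop job the map and reduce functions would run on separate data partitions and only the small count table would reach the driver, which is why the counting step tends to scale with sample size; whether the paper uses this exact decomposition is not stated in the abstract.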
Received 2017-09-06
Revised 2017-10-17
Pages 829-832, 837
CLC number TP391
Document code A