《计算机应用研究》|Application Research of Computers

一种改进过采样算法在类别不平衡信用评分中的应用

Application of improved oversampling algorithm in class-imbalance credit scoring

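Authors Shao Liangshan (邵良杉), Zhou Yu (周玉)
Affiliation System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125105, China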
Article ID 1001-3695(2019)06-018-1683-05
DOI 10.19734/j.issn.1001-3695.2017.12.0798
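Abstract To address the class imbalance among samples in the credit scoring business of the credit industry, the main evaluation indicators are first selected by analyzing the Fisher ratio of each factor that influences credit scoring. Then, with the support-degree-based oversampling algorithm (SDSMOTE) as the sample synthesis algorithm, the support vector machine (SVM) as the base predictor, and Boosting as the framework, a class-imbalance credit scoring prediction model based on Fisher-SDSMOTE-ESBoostSVM is constructed. After each base classifier has been trained, an elimination strategy is applied: synthetic samples that were not classified correctly are deleted, positive-class samples are regenerated, and the sample weights are corrected. Finally, with the German credit dataset from the UCI repository as the experimental sample and the F-measure and G-mean as evaluation metrics, the predictions of Fisher-SDSMOTE-ESBoostSVM are compared with those of other ensemble learning algorithms. The experimental results show that applying the Fisher-SDSMOTE-ESBoostSVM algorithm to customer credit score prediction in the credit industry is feasible and adaptable, that its prediction accuracy is relatively high, and that it has practical application value.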
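The abstract names a Fisher ratio analysis and an oversampling step but gives no formulas. As a minimal sketch, assuming the common two-class Fisher ratio (squared difference of the class means divided by the sum of the class variances) and plain SMOTE interpolation rather than the paper's support-degree variant, the two steps might look as follows. The names fisher_ratio and smote_like and their parameters are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher ratio of one feature (assumed form):
    (mean_pos - mean_neg)^2 / (var_pos + var_neg)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Degenerate case: both classes are constant on this feature.
        return float("inf") if mean(pos) != mean(neg) else 0.0
    return (mean(pos) - mean(neg)) ** 2 / spread


def smote_like(minority, n_new, k=2, seed=0):
    """Plain SMOTE-style synthesis (NOT SDSMOTE): each new sample lies on the
    segment between a minority sample and one of its k nearest minority
    neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Toy data: one feature, three majority (0) and three minority (1) samples.
    print(fisher_ratio([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1]))
    print(smote_like([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], n_new=3))
```

A feature with a larger ratio separates the two classes better, so ranking features by this value is one way to pick the main indicators. The support degree that SDSMOTE adds to the synthesis step is not described in the abstract and is deliberately left out here.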
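Keywords credit scoring; class imbalance; SDSMOTE algorithm; Fisher criterion; support vector machine; ensemble learning
Funding National Natural Science Foundation of China (71371091)
Liaoning Provincial Social Planning Project (L14BTJ004)
Article URL http://www.arocmag.com/article/01-2019-06-018.html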
English title Application of improved oversampling algorithm in class-imbalance credit scoring
Authors (English) Shao Liangshan, Zhou Yu
Affiliation (English) System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125105, China
English abstract To address the class imbalance found in real credit scoring business in the credit industry, this paper first determines the main evaluation indicators for credit scoring from a comprehensive analysis of the Fisher ratio of each influencing factor. It then uses the SMOTE based on support degree (SDSMOTE) oversampling algorithm to synthesize new samples, the SVM as the base predictor, and the Boosting algorithm as the framework, and proposes a class-imbalance credit scoring prediction model based on Fisher-SDSMOTE-ESBoostSVM. It also introduces an elimination strategy that deletes the synthetic samples that were not classified correctly, then synthesizes new positive-class samples and modifies the sample weights. Finally, it takes the German credit dataset in the UCI database as the experimental dataset and the F-measure and G-mean as evaluation metrics, and compares the prediction results of the Fisher-SDSMOTE-ESBoostSVM model with those of other ensemble learning algorithms. The experimental results show that applying the Fisher-SDSMOTE-ESBoostSVM algorithm to customer credit score prediction is feasible and applicable and achieves a high level of accuracy, which indicates that the algorithm has practical application value.
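The two evaluation metrics named in the abstract can be computed from a confusion matrix. The sketch below uses their standard definitions (F-measure as the weighted harmonic mean of precision and recall, G-mean as the geometric mean of recall and specificity) and treats the minority class as the positive class. The function names, the beta parameter, and the toy labels are assumptions for illustration and do not come from the paper.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), counting `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def f_measure(y_true, y_pred, beta=1.0, positive=1):
    """F-measure: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0


def g_mean(y_true, y_pred, positive=1):
    """G-mean: sqrt(recall on the positive class * recall on the negative class)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(recall * specificity)


if __name__ == "__main__":
    # Toy labels: 3 minority (1) and 7 majority (0) samples.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(f_measure(y_true, y_pred), g_mean(y_true, y_pred))
```

On these toy labels, a classifier that predicts the majority class for every sample would reach 70% accuracy while both metrics drop to 0, which is why they are preferred over plain accuracy on imbalanced data.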
Keywords (English) credit scoring; class-imbalance; SDSMOTE algorithm; Fisher criterion; support vector machine; ensemble learning
Received 2017/12/7
Revised 2018/1/22
Pages 1683-1687
CLC number TP391
Document code A