《计算机应用研究》|Application Research of Computers

一种面向多类不平衡协议流量的改进AdaBoost.M2算法

Improved AdaBoost.M2 algorithm for multiclass imbalanced protocol traffic

Authors 张仁斌, 张杰, 吴佩
Affiliation 合肥工业大学 计算机与信息学院, 合肥 230009
Article ID 1001-3695(2019)06-055-1863-05
DOI 10.19734/j.issn.1001-3695.2018.01.0010
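Abstract To address the shortcomings of the AdaBoost.M2 algorithm when classifying multiclass imbalanced protocol traffic, this paper proposes RBWS-ADAM2, an ensemble learning algorithm for multiclass imbalanced classification of Internet protocol traffic. In each iteration of AdaBoost.M2, the algorithm preprocesses the training data with a weight-based random balanced resampling strategy. The strategy resamples around a randomly set sampling balance point to change the proportions of majority-class and minority-class samples, thereby building multiple diverse training sets. It also uses sample weights as the criterion for sample selection, retaining high-weight samples as far as possible to strengthen learning on them. Experiments comparing RBWS-ADAM2 with similar algorithms on internationally public protocol traffic datasets show that the algorithm not only substantially improves the F-measure of some minority classes but also effectively raises the ensemble classifier's overall G-mean and overall average F-measure, clearly enhancing its overall performance.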
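The abstract describes the resampling step only in outline, so the following Python sketch is one possible reading, not the authors' implementation. The function name `rbws_resample` is hypothetical, and three details are assumptions: a single balance point is drawn uniformly between the smallest and largest class size, classes above it keep only their highest-weight samples, and classes below it keep every sample and add weight-proportional duplicates.

```python
import numpy as np


def rbws_resample(y, weights, rng):
    """Return indices of one resampled training set (illustrative sketch).

    Assumed, not taken from the paper: the balance point is uniform between
    the smallest and largest class size, and `weights` are strictly positive.
    """
    y = np.asarray(y)
    weights = np.asarray(weights, dtype=float)
    classes, counts = np.unique(y, return_counts=True)
    # Randomly chosen balance point: the target size for every class.
    target = int(rng.integers(counts.min(), counts.max() + 1))

    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        if count >= target:
            # Undersample: keep the `target` samples with the largest weights.
            order = np.argsort(weights[idx])[::-1]
            selected.append(idx[order[:target]])
        else:
            # Oversample: keep all samples, then draw extra copies with
            # probability proportional to weight.
            p = weights[idx] / weights[idx].sum()
            extra = rng.choice(idx, size=target - count, replace=True, p=p)
            selected.append(np.concatenate([idx, extra]))
    return np.concatenate(selected)


# Usage: three classes of very different sizes end up equally sized.
rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 8 + [2] * 3)
w = rng.random(y.size) + 0.01  # stand-in for boosting sample weights
idx = rbws_resample(y, w, rng)
print(np.bincount(y[idx]))
```

Calling the function once per boosting round with a fresh balance point yields a different training set each time, which is how the diversity mentioned in the abstract could arise under these assumptions.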
Keywords traffic classification; ensemble learning algorithm; multiclass imbalance; generalization performance
Article URL http://www.arocmag.com/article/01-2019-06-055.html
English title Improved AdaBoost.M2 algorithm for multiclass imbalanced protocol traffic
Authors (English) Zhang Renbin, Zhang Jie, Wu Pei
Affiliation (English) School of Computer & Information, Hefei University of Technology, Hefei 230009, China
English abstract The existing AdaBoost.M2 algorithm falls short when classifying multiclass imbalanced protocol traffic. This paper therefore proposes an ensemble algorithm called RBWS-ADAM2 for the classification of multiclass Internet traffic. During each iteration of AdaBoost.M2, the algorithm preprocesses the training dataset by weight-based random balanced resampling: it changes the numbers of majority-class and minority-class samples by randomly setting the sampling balance point, building multiple different training datasets. Moreover, the strategy uses sample weights as the basis for sample screening, retaining high-weight samples as far as possible to strengthen learning on them. An experimental comparison of RBWS-ADAM2 with other similar algorithms on internationally published protocol traffic datasets shows that RBWS-ADAM2 not only improves the F-measure of some minority classes but also effectively increases the overall G-mean and the overall average F-measure, clearly enhancing the overall performance of the ensemble classifier.
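The abstract reports per-class F-measure, overall G-mean, and overall average F-measure without giving formulas. The sketch below assumes the common multiclass definitions (G-mean as the geometric mean of per-class recalls, average F-measure as the unweighted mean of per-class F-measures); the paper may define them differently, and the function names are hypothetical.

```python
import numpy as np


def per_class_scores(y_true, y_pred):
    """Return (recalls, f_measures), one entry per class present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls, f_measures = [], []
    for cls in np.unique(y_true):
        tp = np.sum((y_true == cls) & (y_pred == cls))
        fn = np.sum((y_true == cls) & (y_pred != cls))
        fp = np.sum((y_true != cls) & (y_pred == cls))
        recall = tp / (tp + fn)  # tp + fn > 0 because cls occurs in y_true
        precision = tp / (tp + fp) if tp + fp > 0 else 0.0
        denom = precision + recall
        f_measures.append(2 * precision * recall / denom if denom > 0 else 0.0)
        recalls.append(recall)
    return np.array(recalls), np.array(f_measures)


def g_mean(recalls):
    """Geometric mean of per-class recalls; zero if any class is fully missed."""
    return float(np.prod(recalls) ** (1.0 / len(recalls)))


# Usage on a small three-class example.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
recalls, f_measures = per_class_scores(y_true, y_pred)
print("G-mean:", g_mean(recalls))
print("average F-measure:", float(f_measures.mean()))
```

Because the G-mean multiplies recalls, a single poorly recognized minority class pulls it down sharply, which is why it is a common summary metric for imbalanced classification.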
English keywords traffic classification; ensemble algorithm; multiclass imbalance; generalization performance
Received 2018/1/16
Revised 2018/3/15
Pages 1863-1867
CLC number TP391
Document code A