《计算机应用研究》|Application Research of Computers

Method for detecting poisoning data based on data complexity (基于数据复杂度的投毒数据检测方法)

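Authors 亢飞 (Kang Fei), 李建彬 (Li Jianbin)
Affiliation Central South University: a. School of Information Science & Engineering; b. Information Security & Big Data Research Institute, Changsha 410083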
Article ID 1001-3695(2020)07-045-2140-04
DOI 10.19734/j.issn.1001-3695.2018.12.0940
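Abstract During the training of a machine learning model, an attacker can mount a poisoning attack by modifying the original training data to generate poisoned data. To address this problem, a poisoned-data detection method based on data complexity is proposed. Starting from a normal data set, the method applies a gradient ascent strategy to self-poison the sample instances in that data set, then mines the effect of the resulting poisoned data on the data complexity of the normal data set in order to train a detection model that can distinguish poisoned data. In the selected application scenarios, the method achieves higher detection accuracy than existing methods. The experimental results show that poisoned data can effectively reduce the predictive ability of a machine learning model, and that the data-complexity-based detection method can effectively detect poisoned data and reduce its adverse effect on the model's predictive ability.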
Keywords machine learning; poisoning attack; gradient ascent; data complexity
Funding
Article URL http://www.arocmag.com/article/01-2020-07-045.html
English title Method for detecting poisoning data based on data complexity
Authors (English) Kang Fei, Li Jianbin
Affiliation (English) a. School of Information Science & Engineering; b. Information Security & Big Data Research Institute, Central South University, Changsha 410083, China
English abstract During model training, an attacker can modify the original training data to generate poisoned data and thereby poison a machine learning model. To address this problem, this paper proposes a poisoned-data detection method based on data complexity. Starting from a normal data set, the method self-poisons the sample instances in that set using a direct gradient ascent strategy, then exploits the influence of the resulting poisoned data on the data complexity of the normal data set to build a detection model that can identify poisoned data. In the selected application scenarios, the method achieves higher detection accuracy than the existing method. The experimental results show that poisoned data can effectively reduce the predictive ability of a machine learning model, and that the data-complexity-based method can effectively detect poisoned data and reduce its adverse effect on the model's predictive ability.
English keywords machine learning; poisoning attack; gradient ascent; data complexity
Received 2018/12/6
Revised 2019/3/4
Pages 2140-2143
CLC number TP309.2
Document code A
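
Illustrative note. The abstract describes self-poisoning a normal data set and then using the resulting shift in data complexity to train a detector. The sketch below is only a toy illustration of that idea, not the authors' implementation. Several parts are assumptions, because the abstract does not specify them: the complexity measure (a Fisher discriminant ratio is used here), the synthetic two-class Gaussian data, the single-threshold "detector", and all function names. In particular, the paper's gradient ascent attack is replaced by a much cruder stand-in that appends mislabeled points, purely to show how poisoning can move a complexity measure.

```python
import random
from statistics import fmean, pvariance


def fisher_ratio(points, labels):
    """Maximum Fisher discriminant ratio over features for binary labels 0/1.

    Higher values mean the classes separate more easily (lower complexity).
    Assumption: the abstract does not say which complexity measures are used.
    """
    ratios = []
    for j in range(len(points[0])):
        a = [p[j] for p, lab in zip(points, labels) if lab == 0]
        b = [p[j] for p, lab in zip(points, labels) if lab == 1]
        ratios.append((fmean(a) - fmean(b)) ** 2 / (pvariance(a) + pvariance(b)))
    return max(ratios)


def make_clean(rng, n=200):
    """Synthetic stand-in data: two well-separated 2-D Gaussian classes."""
    pts = [(rng.gauss(-2, 1), rng.gauss(-2, 1)) for _ in range(n)]
    pts += [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(n)]
    return pts, [0] * n + [1] * n


def self_poison(rng, points, labels, n_poison=40):
    """Crude stand-in for the attack: append class-1-like points labeled 0.

    This is NOT the gradient ascent strategy named in the abstract.
    """
    extra = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(n_poison)]
    return points + extra, labels + [0] * n_poison


rng = random.Random(0)
clean_scores, poisoned_scores = [], []
for _ in range(20):
    pts, labs = make_clean(rng)
    clean_scores.append(fisher_ratio(pts, labs))
    p_pts, p_labs = self_poison(rng, pts, labs)
    poisoned_scores.append(fisher_ratio(p_pts, p_labs))

# One-feature "detector": a threshold halfway between the two group means.
threshold = (fmean(clean_scores) + fmean(poisoned_scores)) / 2
flagged = [score < threshold for score in poisoned_scores]
print(f"mean ratio, clean: {fmean(clean_scores):.2f}")
print(f"mean ratio, poisoned: {fmean(poisoned_scores):.2f}")
print(f"poisoned sets flagged: {sum(flagged)}/{len(flagged)}")
```

In this toy setting the mislabeled points pull the class-0 mean toward class 1 and inflate its variance, so the ratio drops. A realistic detector would presumably combine several complexity measures and a trained classifier rather than a single threshold.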