《计算机应用研究》 | Application Research of Computers

OKDCR: Online Knowledge Distillation via Consistency Regularization

Authors: Zhang Xiaobing, Gong Haigang, Liu Ming
Affiliation: School of Computer Science & Engineering, University of Electronic Science & Technology of China, Chengdu 611731, China
Article ID: 1001-3695(2021)11-007-3249-05
DOI: 10.19734/j.issn.1001-3695.2021.05.0139
Abstract: Online knowledge distillation trains an ensemble of two or more peer models simultaneously, letting them learn each other's extracted features so that all models improve together. Existing methods focus on directly aligning features across models and thus neglect the distinctiveness and robustness of features near the decision boundary. This paper uses consistency regularization to guide each model toward learning discriminative features at the decision boundary. Specifically, each model in the network consists of a feature extractor and a pair of task-specific classifiers. Intra-model consistency is measured by the distribution distance between the two classifiers of the same model, and inter-model consistency by the distance between corresponding classifiers of different models; together, the two consistency terms update the feature extractors and regularize feature learning around the decision boundary. In addition, the intra-model consistency serves as an adaptive weight that is combined with each model's mean prediction to form a weighted ensemble, which in turn guides all classifiers to learn from it jointly. The proposed method consistently achieves superior classification performance on multiple public datasets.
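
The abstract outlines three computational pieces: an intra-model consistency term between a model's two classifiers, an inter-model consistency term between corresponding classifiers of different models, and a consistency-weighted ensemble of the models' mean predictions. The following minimal PyTorch sketch illustrates one plausible reading of those pieces; the choice of distribution distance (symmetric KL over temperature-softened outputs), the temperature T, the softmax weighting, and all names are assumptions for illustration, not the paper's actual implementation.

    import torch
    import torch.nn.functional as F

    def okdcr_consistency_losses(logits, T=3.0):
        """Sketch of the two consistency terms described in the abstract.

        logits: list over M peer models; logits[m] = (z_a, z_b), the pair of
        task-specific classifier outputs of model m, each of shape (B, C).
        The distribution distance is assumed here to be symmetric KL between
        temperature-softened softmax outputs.
        """
        def dist(p_logits, q_logits):
            # Symmetric KL divergence between softened distributions.
            p = F.log_softmax(p_logits / T, dim=1)
            q = F.log_softmax(q_logits / T, dim=1)
            kl_pq = F.kl_div(q, p.exp(), reduction="batchmean")  # KL(P || Q)
            kl_qp = F.kl_div(p, q.exp(), reduction="batchmean")  # KL(Q || P)
            return 0.5 * (kl_pq + kl_qp)

        # Intra-model consistency: distance between the two classifiers
        # that share one feature extractor.
        intra = [dist(za, zb) for (za, zb) in logits]

        # Inter-model consistency: distance between corresponding
        # classifiers of different models, over all model pairs.
        inter = []
        for m in range(len(logits)):
            for n in range(m + 1, len(logits)):
                inter.append(dist(logits[m][0], logits[n][0]))
                inter.append(dist(logits[m][1], logits[n][1]))

        intra_loss = torch.stack(intra).mean()
        inter_loss = torch.stack(inter).mean()

        # Adaptive ensemble: weight each model's mean prediction by its
        # intra-model consistency (assumed: more consistent -> larger weight).
        w = torch.softmax(-torch.stack(intra), dim=0)            # (M,)
        mean_preds = torch.stack(
            [(F.softmax(za / T, 1) + F.softmax(zb / T, 1)) / 2
             for (za, zb) in logits]
        )                                                        # (M, B, C)
        ensemble = (w.view(-1, 1, 1) * mean_preds).sum(0)        # (B, C)

        return intra_loss, inter_loss, ensemble

In a training loop, these terms would presumably be added to each classifier's cross-entropy loss, and a KL term toward the weighted ensemble would let all classifiers learn from it jointly, as the abstract describes; loss coefficients and schedules are omitted here.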
Keywords: computer vision; model compression; online knowledge distillation; consistency regularization
Funding: National Natural Science Foundation of China (61572113); Fundamental Research Funds for the Central Universities (XGBDFZ09)
URL: http://www.arocmag.com/article/01-2021-11-007.html
Received: 2021-05-12
Revised: 2021-06-24
Pages: 3249-3253
CLC number: TP183
Document code: A