《计算机应用研究》|Application Research of Computers

Compression of deep neural networks

Authors Han Yunfei, Jiang Tonghai, Ma Yupeng, Xu Chunxiang, Zhang Rui
Affiliations 1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; 2. Xinjiang Laboratory of Minority Speech & Language Information Processing, Urumqi 830011, China; 3. University of Chinese Academy of Sciences, Beijing 100049, China
Article number 1001-3695(2018)10-2894-04
DOI 10.3969/j.issn.1001-3695.2018.10.003
Abstract The excessive number of parameters in deep neural networks makes them highly compute-intensive and memory-intensive models, so deep-neural-network applications cannot easily be ported to embedded or mobile devices to meet practical needs in special environments. To address this problem, this paper proposes a neural network compression scheme that combines network pruning with parameter sharing. First, connections whose weights are smaller than a threshold are pruned away, retaining only the important connections; then the K-means clustering algorithm clusters the remaining parameters of each pruned layer, and the parameters within each cluster share the cluster's center value as their weight. In experiments on handwritten digit recognition with the MNIST dataset, the LeNet-300-100 network and a modified LeNet-300-240-180-100 network were compressed by 9.5× and 12.1×, respectively. The combined pruning and parameter-sharing compression scheme provides a feasible path toward richer deep-neural-network-based intelligent applications in special environments in the future.
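To make the two steps concrete, the following is a minimal sketch, not the authors' implementation: it applies magnitude-based pruning and then K-means weight sharing to a single fully connected layer using NumPy and scikit-learn; the threshold and cluster count are arbitrary illustrative values.

import numpy as np
from sklearn.cluster import KMeans

def prune_and_share(weights, threshold, n_clusters):
    # Step 1: network pruning -- drop connections whose |weight| is below the threshold.
    mask = np.abs(weights) >= threshold
    pruned = weights * mask
    # Step 2: parameter sharing -- cluster the surviving weights of this layer
    # and replace each one with the center value of its cluster.
    survivors = pruned[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(survivors)
    shared = pruned.copy()
    shared[mask] = km.cluster_centers_[km.labels_].ravel()
    return shared, mask, km.cluster_centers_.ravel()

# Toy 300x100 matrix standing in for one layer of LeNet-300-100; values are random.
W = np.random.randn(300, 100) * 0.1
W_shared, mask, codebook = prune_and_share(W, threshold=0.05, n_clusters=32)
print("kept %.1f%% of connections, codebook size %d" % (100 * mask.mean(), codebook.size))

Clustering each layer separately, as described above, keeps one small codebook per layer; the abstract does not specify the threshold, the number of clusters per layer, or whether the network is retrained afterwards.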
Keywords neural networks; compression; network pruning; parameter sharing
Funding Science and Technology Service Network Initiative (STS Program) of the Chinese Academy of Sciences (KFJ-EW-STS-129)
West Light Talent Training Program of the Chinese Academy of Sciences (XBBS201319)
Youth Innovation Promotion Association of the Chinese Academy of Sciences
High-Level Talent Introduction Program of the Xinjiang Uygur Autonomous Region
Article URL http://www.arocmag.com/article/01-2018-10-003.html
English title Compression of deep neural networks
Authors (English) Han Yunfei, Jiang Tonghai, Ma Yupeng, Xu Chunxiang, Zhang Rui
Affiliations (English) 1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; 2. Xinjiang Laboratory of Minority Speech & Language Information Processing, Urumqi 830011, China; 3. University of Chinese Academy of Sciences, Beijing 100049, China
English abstract Over-parameterized deep neural networks are both computationally and memory intensive, which makes them difficult to deploy on embedded or mobile systems that must meet practical needs in special environments. To address this problem, this paper proposed a neural network compression method that combines pruning with parameter sharing. First, it pruned the network connections whose weights were smaller than a threshold, retaining only the important connections. Then it clustered the parameters of each pruned layer with K-means, and the parameters within each cluster shared the cluster's center value as their weight. In the experiments, the LeNet-300-100 network and a modified LeNet-300-240-180-100 network derived from it were trained to recognize handwritten digits on the MNIST dataset, and they were compressed by 9.5× and 12.1× respectively, with about 1% loss of accuracy. The proposed pruning and parameter-sharing compression method makes it feasible to run intelligent applications based on deep neural networks in special environments in the future.
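As a rough illustration of where such compression ratios come from (the abstract does not state the storage format, pruning rate, or cluster counts, so the figures below are assumptions rather than the paper's accounting): if each surviving weight is stored as a position index plus a cluster index, with one small codebook of full-precision centroids per layer, the ratio can be estimated as follows.

import math

def estimate_compression_ratio(layer_shapes, keep_ratio, n_clusters,
                               position_bits=32, weight_bits=32):
    # Dense float storage vs. pruned weights stored as (position index + cluster index)
    # plus a per-layer centroid codebook. All bit widths are illustrative assumptions.
    index_bits = max(1, math.ceil(math.log2(n_clusters)))
    original_bits = 0
    compressed_bits = 0
    for rows, cols in layer_shapes:
        n = rows * cols
        kept = int(round(n * keep_ratio))
        original_bits += n * weight_bits                        # dense weights
        compressed_bits += kept * (position_bits + index_bits)  # surviving connections
        compressed_bits += n_clusters * weight_bits             # codebook for this layer
    return original_bits / compressed_bits

# Fully connected layers of LeNet-300-100 (784-300-100-10); keep_ratio and n_clusters
# are hypothetical values, not figures taken from the paper.
shapes = [(784, 300), (300, 100), (100, 10)]
print("estimated ratio: %.1fx" % estimate_compression_ratio(shapes, keep_ratio=0.1, n_clusters=32))

The reported 9.5× and 12.1× ratios depend on how aggressively each layer is pruned, how many clusters each layer uses, and how the indices are encoded, none of which are detailed in this abstract.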
English keywords neural networks; compression; network pruning; parameter sharing
Received 2017-05-08
Revised 2017-07-26
Pages 2894-2897, 2903
CLC number TP183
Document code A