《计算机应用研究》|Application Research of Computers

神经网络模型压缩方法综述

Survey of neural network model compression methods

Authors: 曹文龙, 芮建武, 李敏
Affiliations: 1. 中国科学院大学, 北京 100190; 2. 中国科学院软件研究所, 北京 100190; 3. 中国科学院通用芯片与基础软件研究中心, 北京 100190
Article ID: 1001-3695(2019)03-002-0649-08
DOI: 10.19734/j.issn.1001-3695.2018.01.0061
Abstract: Owing to limits on storage space and power consumption, storing and running neural network models on embedded devices remains a major challenge. Model compression, an effective solution, has drawn growing attention from researchers. This paper studies convolutional neural network models, analyzes the redundant information they contain, organizes the research results of scholars at home and abroad on neural network model compression, and summarizes the main compression methods in terms of parameter pruning, weight sharing, and weight matrix decomposition. Finally, it discusses the current state of neural network models and several major open problems, and points out directions for further research.
Keywords: neural network; model compression; matrix decomposition; parameter sharing
URL: http://www.arocmag.com/article/01-2019-03-002.html
English title: Survey of neural network model compression methods
Authors (English): Cao Wenlong, Rui Jianwu, Li Min
Affiliations (English): 1. University of Chinese Academy of Sciences, Beijing 100190, China; 2. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 3. General Chips & Basic Software Research Center, Chinese Academy of Sciences, Beijing 100190, China
English abstract: Limited by memory space and power, deploying deep neural network models in embedded systems is a challenging task. As an effective solution, model compression has attracted tremendous attention from researchers. This paper first introduces the basic deep neural network model and analyzes the redundant computation it contains. It then presents the existing model compression methods from the aspects of parameter pruning, parameter sharing, and weight matrix decomposition. Finally, it discusses the remaining challenges in deep neural networks and directions for future research.
English keywords: neural network; model compression; matrix decomposition; parameter sharing
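
The abstracts above name three families of compression methods. As an illustration of the first, parameter pruning, here is a minimal NumPy sketch of magnitude-based pruning; the threshold rule, function name, and layer shapes are illustrative assumptions, not the procedure of any particular surveyed paper.

import numpy as np

def prune_by_magnitude(weights: np.ndarray, sparsity: float):
    """Zero out the smallest-magnitude fraction of weights; return the
    pruned matrix and the binary mask that fine-tuning would reuse to
    keep pruned positions at zero."""
    k = int(weights.size * sparsity)                 # number of weights to drop
    cutoff = np.sort(np.abs(weights), axis=None)[k]  # magnitude threshold
    mask = (np.abs(weights) >= cutoff).astype(weights.dtype)
    return weights * mask, mask

# Example: remove 90% of the connections of a random dense layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512))
w_pruned, mask = prune_by_magnitude(w, sparsity=0.9)
print(f"achieved sparsity: {1 - mask.mean():.2%}")

In practice the mask is reapplied during fine-tuning so that pruned connections stay at zero while the surviving weights recover accuracy.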
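For the second family, parameter (weight) sharing, one common realization is to cluster a layer's weights into a small codebook so that each weight is stored as a short integer index. A minimal sketch, assuming one-dimensional k-means with a linear initialization; all names and settings here are illustrative.

import numpy as np

def build_codebook(weights: np.ndarray, bits: int = 4, n_iter: int = 20):
    """One-dimensional k-means over all weights: the layer is stored as
    2**bits shared float values (the codebook) plus a small integer
    index per weight."""
    flat = weights.ravel()
    k = 2 ** bits
    centroids = np.linspace(flat.min(), flat.max(), k)  # linear initialization
    for _ in range(n_iter):
        # Assign every weight to its nearest centroid, then update.
        idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for c in range(k):
            members = flat[idx == c]
            if members.size > 0:
                centroids[c] = members.mean()
    # Final assignment against the converged centroids.
    idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids, idx.reshape(weights.shape)

# Example: 4-bit indices plus a 16-entry codebook stand in for 32-bit floats.
rng = np.random.default_rng(0)
w = rng.standard_normal((128, 128))
codebook, idx = build_codebook(w, bits=4)
w_shared = codebook[idx]                                # reconstructed layer
print(f"mean reconstruction error: {np.abs(w - w_shared).mean():.4f}")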
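For the third family, weight matrix decomposition, a standard low-rank construction replaces one large layer with two smaller factors obtained by truncated SVD. A minimal sketch under that assumption; note that trained weight matrices are typically far closer to low rank than the random example below.

import numpy as np

def low_rank_factorize(weights: np.ndarray, rank: int):
    """Approximate an (m, n) weight matrix by an (m, rank) times
    (rank, n) product via truncated SVD, so one large layer becomes
    two much smaller ones."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b

# Example: a 1024x1024 layer factored at rank 64 stores ~8x fewer parameters.
rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024))
a, b = low_rank_factorize(w, rank=64)
print(f"compression ratio: {w.size / (a.size + b.size):.1f}x")
print(f"relative error: {np.linalg.norm(w - a @ b) / np.linalg.norm(w):.3f}")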
Received: 2018-01-12
Revised: 2018-03-13
Pages: 649-656
CLC number: TP183
Document code: A