Application Research of Computers | 《计算机应用研究》

Review of deep learning

Authors Zhang Junyang, Wang Huili, Guo Yang, Hu Xiao
Affiliation College of Computer, National University of Defense Technology, Changsha 410073, China
Article ID 1001-3695(2018)07-1921-08
DOI 10.3969/j.issn.1001-3695.2018.07.001
Abstract In order to keep track of the latest research progress in deep learning and grasp its current research hotspots and directions, this paper reviews the research related to deep learning. It first introduces the application background and application domains of deep learning and points out the importance of studying it, and then presents several important neural network models and two commonly used parallelization schemes for training large-scale models, the aim being to understand, at an essential level, deep learning model architectures and their optimization techniques. It then compares and analyzes the current mainstream deep learning software tools and the related industrial research platforms, so as to provide a reference for the practical use of neural network models. Finally, it describes in detail several mainstream hardware acceleration techniques for deep learning and their latest research status, and discusses future research directions.
Keywords deep learning; neural network; algorithm model; software tools; hardware acceleration
Funding National Natural Science Foundation of China (61572025); National Key Research and Development Program of China (2016YFB0200401)
Article URL http://www.arocmag.com/article/01-2018-07-001.html
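The abstract mentions two commonly used parallel schemes for large-scale model training; in the distributed deep learning literature these are typically data parallelism and model parallelism. As an illustrative sketch only, not code from the paper, the following self-contained Python snippet simulates synchronous data parallelism on a toy linear model: each simulated worker holds a full replica of the parameters, computes a gradient on its own shard of the minibatch, and the averaged gradient updates all replicas in lockstep. All function and variable names here are hypothetical.

    import numpy as np

    def grad_mse(w, X, y):
        # Gradient of the mean-squared error for the linear model y_hat = X @ w.
        return 2.0 * X.T @ (X @ w - y) / len(y)

    def data_parallel_step(w, X, y, n_workers=4, lr=0.1):
        # Each simulated worker computes a gradient on its own shard of the
        # minibatch; averaging the shard gradients (equal-sized shards here)
        # recovers the full-batch gradient, so all replicas stay in sync.
        shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
        g = np.mean([grad_mse(w, Xs, ys) for Xs, ys in shards], axis=0)
        return w - lr * g

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true                      # noiseless targets for the toy problem
    w = np.zeros(3)
    for _ in range(200):
        w = data_parallel_step(w, X, y)
    print(w)                            # converges toward w_true

Model parallelism, by contrast, splits the parameters of a single model across workers; the sketch above deliberately shows only the simpler data-parallel case.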
Received 2017/5/9
Revised 2017/6/19
Pages 1921-1928, 1936
CLC number TP181
Document code A