《计算机应用研究》|Application Research of Computers

3D-ACC: convolution neural network accelerator based on 3D integrated circuits

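Authors Wang Jijun (王吉军), Hao Ziyu (郝子宇), Li Hongliang (李宏亮)
Affiliation Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083, China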
Article ID 1001-3695(2020)12-030-3671-06
DOI 10.19734/j.issn.1001-3695.2019.08.0558
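Abstract In deep-submicron processes, raising computing capability by continually enlarging the chip lowers the operating frequency, sharply increases power consumption, and reduces computational efficiency. Using 3D integrated circuit technology, this paper therefore proposes and quantitatively studies 3D-ACC, a convolutional neural network accelerator that maps a 2D systolic array onto a 3D integrated circuit. It designs an efficient convolution mapping method, builds a performance model, and quantitatively analyzes how different design parameters affect the performance and efficiency of 3D-ACC. Experimental results show that with a stack of four 64×64 systolic-array layers, 3D-ACC reaches a peak performance of 32 TFLOPS, and its actual computational efficiency on VGG-16, ResNet-50, and Inception V3 reaches 47.4%, 37.9%, and 40.9%, respectively. Compared with 2D-ACC, a 2D accelerator with the same number of processing elements, 3D-ACC has a clear advantage in efficiency and performance: its actual performance is 1.51, 1.69, and 1.61 times that of 2D-ACC, respectively. The work explores the advantages of 3D integrated circuits in neural network accelerator design and offers a reference for further improving accelerator performance.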
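The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not the paper's performance model: it assumes each processing element completes one multiply-accumulate (counted as 2 FLOPs) per cycle, which the abstract does not state, and the function names are hypothetical. Under that assumption it derives the clock rate implied by the 32 TFLOPS peak, the sustained throughput per network, and the 2D-ACC throughput implied by the reported speedups.

```python
# Figures quoted in the abstract.
LAYERS, ROWS, COLS = 4, 64, 64
PEAK_TFLOPS = 32.0
EFFICIENCY = {"VGG-16": 0.474, "ResNet-50": 0.379, "Inception V3": 0.409}
SPEEDUP_VS_2D = {"VGG-16": 1.51, "ResNet-50": 1.69, "Inception V3": 1.61}

# ASSUMPTION (not in the abstract): one multiply-accumulate per PE per
# cycle, counted as 2 floating-point operations.
FLOPS_PER_PE_PER_CYCLE = 2


def pe_count(layers, rows, cols):
    """Total processing elements in the stacked systolic arrays."""
    return layers * rows * cols


def implied_clock_ghz(peak_tflops, pes, flops_per_pe_per_cycle):
    """Clock rate that would yield the given peak, under the assumption above."""
    return peak_tflops * 1e12 / (pes * flops_per_pe_per_cycle) / 1e9


def sustained_tflops(peak_tflops, efficiency):
    """Actual throughput = peak throughput x computational efficiency."""
    return peak_tflops * efficiency


if __name__ == "__main__":
    pes = pe_count(LAYERS, ROWS, COLS)
    clock = implied_clock_ghz(PEAK_TFLOPS, pes, FLOPS_PER_PE_PER_CYCLE)
    print(f"PEs: {pes}, implied clock: {clock:.3f} GHz")
    for net, eff in EFFICIENCY.items():
        actual_3d = sustained_tflops(PEAK_TFLOPS, eff)
        # The abstract reports 3D-ACC's actual performance as a multiple of
        # 2D-ACC's, so dividing recovers the implied 2D-ACC throughput.
        actual_2d = actual_3d / SPEEDUP_VS_2D[net]
        print(f"{net}: 3D-ACC {actual_3d:.2f} TFLOPS, 2D-ACC {actual_2d:.2f} TFLOPS")
```

If the one-MAC-per-cycle assumption holds, the 16384 processing elements would need to run at roughly 1 GHz to reach the stated peak. A different per-PE operation count would scale that estimate accordingly.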
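Keywords 3D integrated circuits; systolic array; loop tiling; performance model
Funding National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software ("He-Gao-Ji") (2018ZX01028-102)
URL http://www.arocmag.com/article/01-2020-12-030.html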
English title 3D-ACC: convolution neural network accelerator based on 3D integrated circuits
Authors (English) Wang Jijun, Hao Ziyu, Li Hongliang
Affiliation (English) Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083, China
English abstract In deep-submicron processes, increasing chip scale to improve performance lowers the chip frequency, sharply increases power consumption, and reduces computational efficiency. Therefore, using 3D integrated circuit technology, this paper proposes and quantitatively studies 3D-ACC, a convolutional neural network accelerator that maps a 2D systolic array onto a 3D integrated circuit. First, the paper designs an efficient convolution mapping algorithm for 3D-ACC and builds a performance model based on the relevant design parameters. It then analyzes the effects of different design parameters on the performance and efficiency of 3D-ACC. The experimental results show that the peak performance of 3D-ACC reaches 32 TFLOPS when a four-layer stack of 64×64 systolic arrays is adopted, and that its actual computational efficiency reaches 47.4%, 37.9%, and 40.9% on the VGG-16, ResNet-50, and Inception V3 models, respectively. The computational efficiency of 3D-ACC is clearly superior to that of a 2D-ACC with the same number of PEs: its actual computational performance is 1.51×, 1.69×, and 1.61× that of the latter. This paper explores some advantages of 3D integrated circuits in neural network accelerator design, which can serve as a reference for further improving the performance of neural network accelerators.
English keywords 3D integrated circuits; systolic array; loop tiling; performance model
Received 2019-08-17
Revised 2019-10-09
Pages 3671-3676, 3680
CLC number TP392
Document code A