Application Research of Computers (《计算机应用研究》)

Monocular depth estimation network with a lightweight pyramid decoder structure

Monocular depth estimation based on light-weight pyramid decoder convolution neural network

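Authors 贾瑞明, 李彤, 李阳, 王一丁
Affiliation School of Information Science & Technology, North China University of Technology (北方工业大学 信息学院), Beijing 100144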
Article ID 1001-3695(2021)01-059-0293-05
DOI 10.19734/j.issn.1001-3695.2019.09.0580
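Abstract To address the large parameter count and computational cost of monocular depth estimation networks, this paper proposes a monocular depth estimation network with a lightweight pyramid decoder structure, which reduces model complexity and running time while maintaining estimation accuracy. The network is based on an encoder-decoder structure and estimates the depth map of a monocular image in an end-to-end manner. The encoder uses the ResNet50 architecture. For the decoder, a lightweight pyramid decoder module is proposed: it uses depthwise dilated separable convolutions and group convolutions to enlarge the receptive field while reducing the number of parameters, and it uses a pyramid structure to fuse feature maps from different receptive fields to improve the module's performance. In addition, skip connections are added between decoder modules to share knowledge and improve the network's estimation accuracy. Experimental results on the NYUD v2 dataset show that, compared with the structured attention guided network, the proposed network reduces the RMS error by about 11.0% and improves computational efficiency by about 84.6%.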
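The parameter savings that the abstract attributes to depthwise separable and group convolutions, and the receptive-field gain from dilation, can be illustrated with standard counting formulas. The sketch below is not the paper's implementation: the channel counts, kernel size, group count, and dilation rates are hypothetical values chosen only for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def grouped_conv_params(c_in, c_out, k, groups):
    """Weight count of a k x k group convolution (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in // groups input channels.
    return k * k * (c_in // groups) * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    # Dilation spreads the taps apart without adding any weights.
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 256, 256, 3
    print("standard:           ", standard_conv_params(c_in, c_out, k))
    print("grouped (4 groups): ", grouped_conv_params(c_in, c_out, k, 4))
    print("depthwise separable:", depthwise_separable_params(c_in, c_out, k))
    for d in (1, 2, 4):
        print("dilation", d, "-> effective kernel", effective_kernel_size(k, d))
```

Because dilation changes only the spacing of the kernel taps, a dilated depthwise separable layer keeps the reduced weight count while covering a wider area, which is consistent with the trade-off the abstract describes.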
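Keywords monocular depth estimation; convolutional neural network; encoder-decoder structure; lightweight pyramid decoder
Funding General Program of the National Natural Science Foundation of China (61673021)
Student Science and Technology Activity Funding Program of North China University of Technology
Article URL http://www.arocmag.com/article/01-2021-01-059.html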
English title Monocular depth estimation based on light-weight pyramid decoder convolution neural network
Author names (English) Jia Ruiming, Li Tong, Li Yang, Wang Yiding
Affiliation (English) School of Information Science & Technology, North China University of Technology, Beijing 100144, China
English abstract This paper proposes a light-weight pyramid decoder convolutional neural network (LPDNet) for monocular depth estimation, which reduces the complexity and computation time of the network model while maintaining estimation accuracy. LPDNet is based on an encoder-decoder structure and estimates the depth map of a monocular image in an end-to-end manner. The encoder network adopts ResNet50. The main part of the decoder network is the light-weight pyramid decoder (LPD) module, which learns representations from a large receptive field with fewer parameters by using depth-wise dilated separable convolutions and group convolutions. The LPD module fuses feature maps of different receptive fields through a pyramid structure. In addition, to share knowledge better and improve estimation accuracy, deconvolution skip connections are added between adjacent decoder modules. Experiments on the NYUD v2 dataset demonstrate that, compared with the structured attention guided network from CVPR 2018, LPDNet reduces the RMS error by about 11.0% and is about 84.6% higher in computational efficiency.
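The RMS comparison reported in the abstract can be read as a relative reduction of a root-mean-square depth error. The sketch below assumes the common definition of RMS error over per-pixel depths and a relative reduction measured against the baseline; the abstract does not spell out either formula, the function names are hypothetical, and the depth values are made up for illustration and are not results from the paper.

```python
import math


def rms_error(pred, target):
    """Root-mean-square error between predicted and ground-truth depths."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(pred, target)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, ours):
    """Fraction by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline


if __name__ == "__main__":
    # Made-up depths in metres, for illustration only.
    target = [1.0, 2.0, 3.0, 4.0]
    baseline_pred = [1.4, 2.5, 2.4, 4.6]
    our_pred = [1.3, 2.4, 2.5, 4.5]
    base = rms_error(baseline_pred, target)
    ours = rms_error(our_pred, target)
    print("relative RMS reduction: {:.1%}".format(relative_reduction(base, ours)))
```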
English keywords monocular depth estimation; convolution neural network; encoder-decoder; light-weight pyramid decoder
Received 2019/9/28
Revised 2019/11/19
Pages 293-297
CLC number TP391.41
Document code A