Application Research of Computers (《计算机应用研究》)

一种加强SSD小目标检测能力的Atrous滤波器设计

Design of Atrous filter to strengthen small object detection capability of SSD

Authors 温捷文, 战荫伟, 李楚宏, 卢剑彪
Affiliation 广东工业大学 计算机学院, 广州 510006
Article number 1001-3695(2019)03-043-0861-05
DOI 10.19734/j.issn.1001-3695.2017.09.0967
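Abstract To address the relatively weak small-object detection of the real-time object detection algorithm SSD (single shot multibox detector), this paper proposes an Atrous filter design strategy that raises the resolution of feature maps. Building on the SSD network structure, the improved algorithm normalizes the feature maps produced by the third and fourth convolutional layers, concatenates them, and then raises their resolution through Atrous convolution. Together, these feature maps supply the features needed for small objects. The improved SSD algorithm also adopts the SeLU (scaled exponential linear units) activation function and includes a set of data augmentation methods designed for the data preprocessing stage. Experiments show that, compared with the original SSD framework, the improved framework achieves higher detection accuracy and better robustness, with a clear improvement on small object detection.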
Keywords SSD; object detection; Atrous
Funding
URL http://www.arocmag.com/article/01-2019-03-043.html
English title Design of Atrous filter to strengthen small object detection capability of SSD
Authors (English) Wen Jiewen, Zhan Yinwei, Li Chuhong, Lu Jianbiao
Affiliation (English) School of Computer Science & Technology, Guangdong University of Technology, Guangzhou 510006, China
English abstract To overcome the shortcoming that SSD (single shot multibox detector) cannot detect small objects well, this paper proposes an Atrous filter design strategy that increases the resolution of feature maps. The improved algorithm concatenates the feature maps generated by the third and fourth convolution layers after normalization, and then increases the resolution of these feature maps by Atrous convolution. The concatenated feature maps provide the features required for small objects. In addition, the improved SSD algorithm adds the SeLU (scaled exponential linear units) activation function and designs a data augmentation method for the data preprocessing phase. The experimental results show that the proposed algorithm has higher detection accuracy and better robustness than the original SSD algorithm. Furthermore, its detection performance is clearly better on small objects.
English keywords SSD; object detection; Atrous
Received 2017/9/20
Revised 2017/12/12
Pages 861-865,872
CLC number TP305
Document code A
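
The pipeline outlined in the abstract (normalize two feature maps, concatenate them, apply an Atrous convolution, then SeLU) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, channel counts, dilation rate, and the assumption that both maps already share the same spatial size are all hypothetical, since the abstract does not specify them. L2 normalization over channels is likewise assumed as the normalization step.

```python
import numpy as np

# SELU constants from the self-normalizing networks formulation.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def l2_normalize(fmap, eps=1e-12):
    """Scale the channel vector at each spatial position to unit L2 norm.

    fmap has shape (channels, height, width).
    """
    norm = np.sqrt(np.sum(fmap ** 2, axis=0, keepdims=True))
    return fmap / (norm + eps)


def selu(x):
    """Scaled exponential linear unit, applied elementwise."""
    negative_part = SELU_ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SELU_SCALE * np.where(x > 0, x, negative_part)


def atrous_conv2d(fmap, weight, rate):
    """Atrous (dilated) convolution, stride 1, zero padding, odd kernel size.

    fmap: (c_in, h, w); weight: (c_out, c_in, k, k); rate: dilation rate.
    The output keeps the input's spatial size.
    """
    c_out, _, k, _ = weight.shape
    pad = rate * (k - 1) // 2
    padded = np.pad(fmap, ((0, 0), (pad, pad), (pad, pad)))
    _, h, w = fmap.shape
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            # Neighbouring kernel taps sample the input `rate` pixels apart.
            patch = padded[:, i * rate:i * rate + h, j * rate:j * rate + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], patch)
    return out


def fuse_and_filter(map_a, map_b, weight, rate):
    """Hypothetical fusion step: normalize, concatenate, atrous conv, SeLU."""
    fused = np.concatenate([l2_normalize(map_a), l2_normalize(map_b)], axis=0)
    return selu(atrous_conv2d(fused, weight, rate))


# Toy stand-ins for the two feature maps (assumed sizes, random values).
rng = np.random.default_rng(0)
conv3 = rng.standard_normal((4, 8, 8))
conv4 = rng.standard_normal((6, 8, 8))
weight = 0.1 * rng.standard_normal((5, 10, 3, 3))
features = fuse_and_filter(conv3, conv4, weight, rate=2)
print(features.shape)  # (5, 8, 8)
```

With stride 1 and zero padding, the dilated kernel covers a wider input region than an ordinary 3x3 kernel while the output stays at the input's spatial size. In a real SSD network the two source maps generally differ in size, so an alignment step not shown here would be needed before concatenation.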