Thesis Information

Title (Chinese):

 交通场景下基于深度学习的目标检测和图像分割研究 (Research on Object Detection and Image Segmentation Based on Deep Learning in Traffic Scenes)

Name:

 刘彬 (Liu Bin)

Student ID:

 18011210267

Confidentiality Level:

 Public

Thesis Language:

 Chinese (chi)

Discipline Code:

 085401

Discipline Name:

 Engineering - Electronic Information - Electronics and Communication Engineering

Student Type:

 Master's student

Degree:

 Master of Engineering

University:

 Xidian University (西安电子科技大学)

School:

 School of Telecommunications Engineering (通信工程学院)

Major:

 Electronics and Communication Engineering

Research Direction:

 Applications of deep learning in traffic scenes

First Supervisor:

 陈晨 (Chen Chen)

First Supervisor's Institution:

 Xidian University (西安电子科技大学)

Second Supervisor:

 王皓 (Wang Hao)

Completion Date:

 2021-05-10

Defense Date:

 2021-05-20

Title (English):

 Research on Object Detection and Image Segmentation Based on Deep Learning in Traffic Scenes

Keywords (Chinese):

 智能交通系统 (Intelligent Transportation System); 自动驾驶 (Automatic Driving); 目标检测 (Object Detection); 图像分割 (Image Segmentation); YOLOv4; DeepLabv3+

Keywords (English):

 Intelligent Transportation System; Automatic Driving; Object Detection; Image Segmentation; YOLOv4; DeepLabv3+

Abstract (Chinese):

With the continuous progress of computer technology, artificial intelligence techniques based on deep learning have developed rapidly and are now widely applied in traffic scenes. In intelligent transportation systems and automatic driving in particular, running object detection algorithms on the captured video provides not only basic information about vehicles and pedestrians but also the corresponding traffic flow information, from which movement trends can be predicted. Beyond this, the surrounding environment must also be perceived accurately, so image segmentation is used to extract environmental information from road images and distinguish important traffic elements such as pedestrians, vehicles, obstacles, lane lines and drivable areas.

Traditional object detection and image segmentation methods rely on hand-crafted features, which severely limits detection and segmentation performance. Deep-learning-based methods instead extract richer feature information by deepening the neural network, which can markedly improve both tasks. In traffic scenes, vehicles are among the most important targets, yet their own characteristics and those of the environment make feature extraction difficult for detection networks, so missed detections, false detections and low detection accuracy frequently occur. For image segmentation, currently popular methods usually struggle to balance speed and accuracy and cannot meet the real-time requirements of traffic scenes.

This thesis studies deep-learning-based vehicle detection and image segmentation in traffic scenes, and proposes effective improvements built on existing, relatively mature deep learning networks. The specific research work is as follows:

(1) A vehicle detection algorithm based on an improved YOLOv4 is proposed. To address the insufficient feature extraction of the YOLOv4 detector on vehicles, the network is improved by introducing the ECA attention mechanism and the high-resolution network HRNet, which significantly strengthens its feature extraction ability. Experimental results show that the proposed improvements effectively reduce the missed and false detections of YOLOv4 in vehicle detection and raise detection accuracy while preserving real-time performance.

(2) A traffic-scene image segmentation algorithm based on an improved DeepLabv3+ is proposed. To address the excessive loss of image information caused by conventional pooling, the SoftPool operation replaces the original pooling. In addition, because the segmentation network runs too slowly to meet real-time requirements, MobileNetV2 is used as the backbone feature extraction network of DeepLabv3+. Experimental results show that the proposed method improves segmentation accuracy while maintaining segmentation speed, effectively improving the segmentation results.
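
The thesis itself gives no code; purely as an illustration of the pooling replacement described above, the following is a minimal PyTorch sketch of SoftPool written with the standard exponential-weighting trick (the function name and the kernel/stride defaults are my own assumptions, not taken from the thesis):

```python
import torch
import torch.nn.functional as F


def soft_pool2d(x, kernel_size=2, stride=None):
    """SoftPool (Stergiou et al.): each activation in a pooling window is weighted
    by its softmax within that window, so strong responses dominate without the
    information loss of hard max pooling."""
    stride = stride or kernel_size
    e_x = torch.exp(x)
    # avg_pool2d computes window sums up to a constant 1/k^2 factor,
    # which cancels in the ratio below.
    num = F.avg_pool2d(x * e_x, kernel_size, stride=stride)
    den = F.avg_pool2d(e_x, kernel_size, stride=stride)
    return num / den.clamp_min(1e-12)


# Example: downsample an encoder feature map by a factor of 2.
feat = torch.randn(1, 256, 64, 64)
print(soft_pool2d(feat).shape)  # torch.Size([1, 256, 32, 32])
```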

Abstract (English):

With the continuous advancement of computer technology, artificial intelligence based on deep learning has developed rapidly and has been widely applied in transportation scenarios. Especially in Intelligent Transportation System and Automatic Driving scenarios, applying object detection algorithms to the collected video yields not only the basic information of vehicles and pedestrians but also the corresponding traffic flow information, from which movement trends can be predicted. In addition, the surrounding environment must be perceived accurately. Therefore, image segmentation technology is used to obtain surrounding environment information from road images and distinguish pedestrians, vehicles, obstacles, lane lines, drivable areas and other important traffic information.

 

Traditional object detection and image segmentation methods require manually designed features, which seriously limits the performance of detection and segmentation. Deep-learning-based object detection and image segmentation methods extract richer feature information by deepening the neural network, which can significantly improve the performance of both tasks. In traffic scenes, vehicles are among the most important targets, but their own characteristics and those of the environment make feature extraction difficult for the detection network, so missed detections, false detections and low detection accuracy often occur. As for image segmentation, currently popular methods usually find it hard to balance speed and accuracy and cannot meet the real-time requirements of traffic scenes.

 

In this thesis, vehicle detection and image segmentation in traffic scenes are taken as the research tasks. On the basis of mature deep learning networks, effective improvement methods are proposed. The specific research work is as follows:

 

(1) A vehicle detection algorithm based on an improved YOLOv4 is proposed. To address the insufficient feature extraction of the YOLOv4 object detection algorithm on vehicles, the YOLOv4 network is improved by introducing the ECA attention mechanism and the high-resolution network HRNet, which significantly improves its feature extraction ability. Experimental results show that the proposed improvements effectively alleviate the missed and false detections of YOLOv4 in vehicle detection and improve detection accuracy while maintaining real-time performance.
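
No implementation details are given in the abstract; as a rough, non-authoritative sketch of the kind of channel-attention module referred to, an ECA block in PyTorch might look as follows (the class name, the gamma/b defaults and the point at which it would be inserted into YOLOv4 are assumptions on my part):

```python
import math

import torch
import torch.nn as nn


class ECAAttention(nn.Module):
    """Efficient Channel Attention (Wang et al., ECA-Net): channel weights are
    obtained from globally pooled features via a cheap 1-D convolution,
    avoiding the channel reduction used in SE blocks."""

    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Kernel size adapted to the channel count, as in the ECA paper.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: (N, C, H, W) feature map, e.g. from a backbone or neck stage.
        y = self.avg_pool(x)                             # (N, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(1, 2))     # 1-D conv across channels
        y = self.sigmoid(y.transpose(1, 2).unsqueeze(-1))
        return x * y.expand_as(x)                        # re-weight channels


# Example: re-weight a 128-channel feature map.
feat = torch.randn(1, 128, 52, 52)
print(ECAAttention(128)(feat).shape)  # torch.Size([1, 128, 52, 52])
```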

 

(2) An image segmentation algorithm based on an improved DeepLabv3+ is proposed. For traffic-scene image segmentation, the SoftPool method replaces the original pooling operation to reduce the excessive loss of image information caused by conventional pooling. In addition, to address the slow running speed of the segmentation network, which cannot meet real-time requirements, MobileNetV2 is used as the backbone feature extraction network of DeepLabv3+. Experimental results show that the proposed method improves segmentation accuracy while maintaining segmentation speed, effectively improving the segmentation results.
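
The SoftPool replacement is sketched earlier; the second change is the lightweight backbone. As an illustrative sketch only (class and parameter names are mine, and the full MobileNetV2 also treats an expansion ratio of 1 specially), the inverted residual block that MobileNetV2 stacks into its feature extractor is roughly:

```python
import torch
import torch.nn as nn


class InvertedResidual(nn.Module):
    """MobileNetV2 inverted residual block (Sandler et al.): 1x1 expansion,
    3x3 depthwise convolution, and 1x1 linear projection, with a skip
    connection when input and output shapes match."""

    def __init__(self, in_ch, out_ch, stride=1, expand_ratio=6):
        super().__init__()
        hidden = in_ch * expand_ratio
        self.use_skip = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # 1x1 pointwise expansion
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise convolution
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear projection (no activation)
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_skip else out


# Example: a stride-2 block halving spatial resolution, 24 -> 32 channels.
feat = torch.randn(1, 24, 128, 128)
print(InvertedResidual(24, 32, stride=2)(feat).shape)  # torch.Size([1, 32, 64, 64])
```

Swapping the heavier encoder usually paired with DeepLabv3+ for stacks of such blocks trades some representational capacity for the inference speed that the real-time requirement demands.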

CLC Number:

 U49

Accession Number:

 49274

Open Access Date:

 2021-12-15
