













 Engineering - Electronic Information - Electronics and Communication Engineering
























 Research on Object Detection and Image Segmentation Based on Deep Learning in Traffic Scene    


 Intelligent Transportation System ; Automatic Driving ; Object Detection ; Image Segmentation ; YOLOv4 ; DeepLabv3+








With the continuous advancement of computer technology, artificial intelligence based on deep learning has developed rapidly and is now widely applied in transportation scenarios. In Intelligent Transportation Systems and automatic driving in particular, applying object detection algorithms to collected video yields not only basic information about vehicles and pedestrians but also the corresponding traffic flow information, from which movement trends can be predicted. The surrounding environment must also be perceived accurately. Image segmentation is therefore used to extract environment information from road images and to distinguish pedestrians, vehicles, obstacles, lane lines, drivable areas, and other important elements.


Traditional object detection and image segmentation methods rely on manually designed features, which severely limits their performance. Methods based on deep learning extract richer feature information by deepening the neural network, which can significantly improve detection and segmentation performance. In traffic scenes, vehicles are among the most important targets, but the characteristics of the vehicles themselves and of their environment make it difficult for a detection network to extract their features, so missed detections, false detections, and low detection accuracy often occur.


This thesis takes vehicle detection and image segmentation in traffic scenes as its research tasks and proposes several effective improvements built on mature deep learning networks. The specific research work is as follows:


(1) A vehicle detection algorithm based on an improved YOLOv4 is proposed. To address insufficient feature extraction when the YOLOv4 object detection algorithm is applied to vehicles, the YOLOv4 detection network is improved by introducing the ECA attention mechanism and the high-resolution network HRNet, which significantly improves the network's feature extraction ability. Experimental results show that the proposed method effectively reduces the missed and false detections of YOLOv4 in vehicle detection, improving detection accuracy while maintaining real-time detection.


(2) An image segmentation algorithm based on an improved DeepLabv3+ is proposed. For image segmentation in traffic scenes, the SoftPool pooling method replaces the original pooling operation in the network to reduce the excessive loss of image information caused by traditional pooling. In addition, because the segmentation algorithm runs too slowly to meet real-time requirements, MobileNetV2 is used as the backbone feature extraction network of DeepLabv3+. Experimental results show that the proposed method improves segmentation accuracy while maintaining segmentation speed, and effectively improves the segmentation results.
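
The pooling replacement described in item (2) can be illustrated with a minimal sketch. It assumes the standard SoftPool definition, in which each activation in a pooling window is weighted by the softmax of the activations in that window. The function names `softpool2d` and `maxpool2d`, the 2x2 window, and the pure-Python list-of-rows representation are illustrative assumptions, not the thesis's implementation, which is not given in the abstract.

```python
import math


def softpool2d(x, k=2, stride=2):
    """SoftPool over a 2D activation map given as a list of rows.

    Each output is sum_i(w_i * a_i) with w_i = exp(a_i) / sum_j exp(a_j),
    taken over the k x k window (hypothetical helper, for illustration only).
    """
    h, w = len(x), len(x[0])
    out = []
    for i in range(0, h - k + 1, stride):
        row = []
        for j in range(0, w - k + 1, stride):
            window = [x[i + di][j + dj] for di in range(k) for dj in range(k)]
            m = max(window)  # subtract the max for numerical stability
            exps = [math.exp(a - m) for a in window]
            total = sum(exps)
            row.append(sum(e * a for e, a in zip(exps, window)) / total)
        out.append(row)
    return out


def maxpool2d(x, k=2, stride=2):
    """Plain max pooling with the same window layout, for comparison."""
    h, w = len(x), len(x[0])
    return [
        [
            max(x[i + di][j + dj] for di in range(k) for dj in range(k))
            for j in range(0, w - k + 1, stride)
        ]
        for i in range(0, h - k + 1, stride)
    ]


if __name__ == "__main__":
    feature_map = [
        [0.0, 1.0, 0.5, 0.5],
        [2.0, 3.0, 0.5, 0.5],
        [1.0, 0.0, 4.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
    ]
    print("softpool:", softpool2d(feature_map))
    print("maxpool: ", maxpool2d(feature_map))
```

Unlike max pooling, which keeps one activation per window and discards the rest, every activation contributes to the SoftPool output in proportion to its softmax weight, so the result lies between the window's mean and its maximum. This is the sense in which less information is lost during downsampling.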







