Thesis information

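Chinese title:	基于卷积神经网络的骨架提取算法设计与优化
Name:	Sun Shuhuai (孙书怀)
Student ID:	17031211583
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	081202
Discipline name:	Engineering - Computer Science and Technology (may confer degrees in Engineering or Science) - Computer Software and Theory
Student type:	Master's
Degree:	Master of Engineering
University:	Xidian University (西安电子科技大学)
School:	School of Computer Science and Technology
Major:	Computer Science and Technology
Research direction:	Computer Science and Technology
First supervisor name:	Qiu Xuehong (裘雪红)
First supervisor affiliation:	Xidian University
Completion date:	2020-03-01
Defense date:	2020-05-23
English title:	Design and Optimization of Skeleton Extraction Algorithm Based on Convolutional Neural Network
Chinese keywords:	骨架提取 (skeleton extraction) ; 卷积神经网络 (convolutional neural network) ; 残差网络 (residual network) ; 骨架尺度 (skeleton scale)
English keywords:	Skeleton extraction ; Convolutional neural network ; Residual network ; Skeleton scale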
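Chinese abstract (translated):	Image data grows by the day, and the influx of massive visual information places higher demands on processing algorithms. As a compact representation of an image, the skeleton concisely conveys the shape of the image foreground and plays an important role in fields that emphasize object shape and impose strict limits on data volume. Skeleton extraction has therefore become a research hotspot. With the development of deep learning, skeleton extraction algorithms based on convolutional neural networks have further improved accuracy. However, as networks deepen, running time and memory usage grow rapidly; to balance training cost, optimized algorithms must be designed on networks of limited depth. To address the low accuracy of existing skeleton extraction algorithms, this thesis studies and analyzes skeleton extraction based on convolutional neural networks and proposes corresponding improvements. Building on the side-output residual network, which effectively improves learning ability on a network of limited depth, and on the skeleton scale, a quantifiable parameter that reduces supervision error, this thesis proposes a new end-to-end skeleton extraction algorithm: FSRN, a side-output residual network that incorporates skeleton scale information. The algorithm uses the VGG-16 network as its basic framework and modifies the network to carry skeleton scale information. Then, from the decoding perspective, it fuses features upward stage by stage through dimensionality reduction, upsampling, and summation. Experimental results verify the feasibility of the algorithm. To improve accuracy, the thesis stacks FSRN units into a ladder structure, yielding a multi-path, multi-supervision improvement scheme, and adds supervision layers to accelerate convergence. Several groups of experiments compare how the number of supervision signals, the number of fusion paths, and the fusion direction affect performance, and the FSRN architecture with the best overall performance is selected. To further improve performance, adjacent features within the same stage of the network are fused from the encoding perspective to strengthen feature representation, and redundant shallow features are removed to improve running efficiency. Experimental comparisons verify the effectiveness of each optimization. The results show that the final proposed bottom-up 3-path, 3-layer FSRN with fusion of adjacent same-stage features effectively extracts object skeletons from images. On several public datasets it achieves higher recognition accuracy than the side-output residual network, the fusing scale-associated side outputs network, and the two-level hierarchical feature integration network, and it also holds certain advantages in convergence speed, resource consumption, and other evaluation metrics.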
English abstract:	With the continuing growth of image data, the influx of massive visual information places higher demands on processing algorithms. As a compact representation of an image, the skeleton concisely conveys the shape of the image foreground and plays an important role in fields that focus on object shape and require small data volumes. Skeleton extraction has become a research hotspot. With the development of deep learning, skeleton extraction algorithms based on convolutional neural networks have further improved accuracy. However, as networks deepen, running time and memory consumption increase rapidly. To balance training cost, it is necessary to design optimized algorithms on networks of limited depth. In view of the low accuracy of existing algorithms, this thesis studies and analyzes skeleton extraction algorithms based on convolutional neural networks and proposes corresponding improvements. Drawing on the side-output residual network, which can improve learning ability, and the skeleton scale, which can reduce supervision error, this thesis proposes a new end-to-end skeleton extraction algorithm, the Fusing Scale-associated Side Outputs Residual Network (FSRN). The algorithm takes the VGG-16 network as its basic framework and modifies the network to carry skeleton scale information. Then, from the decoding point of view, features are fused stage by stage through dimensionality reduction, upsampling, and summation. Experimental results demonstrate the feasibility of the algorithm. To improve accuracy, this thesis proposes a multi-path, multi-supervision improvement scheme: FSRN units are stacked to form a ladder structure, and supervision layers are added to accelerate convergence. Several groups of experiments are designed to compare and analyze the influence of the number of supervision signals, the number of fusion paths, and the fusion direction on performance, and the FSRN architecture with the best overall performance is selected. To further improve performance, adjacent features within the same stage of the network are fused from the encoding point of view to enhance feature representation, and redundant shallow features are eliminated to improve efficiency. Experimental comparisons verify the effectiveness of the optimization schemes. The experimental results show that the proposed three-way, three-layer FSRN with adjacent-feature fusion effectively extracts object skeletons from images, achieves higher recognition accuracy than existing algorithms such as the side-output residual network, the fusing scale-associated deep side outputs network, and the two-level hierarchical feature integration network, and has certain advantages in convergence speed, resource consumption, and other evaluation metrics.
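Illustrative note: the abstract describes fusing features stage by stage through dimensionality reduction, upsampling, and summation. The following is a minimal sketch of one such fusion step, not the thesis's FSRN implementation. It assumes feature maps stored as nested lists indexed [channel][row][col], uses fixed weights in place of a learned 1x1 convolution, and swaps in nearest-neighbour upsampling for whatever upsampling the thesis actually uses. All function names and values are hypothetical.

```python
# Hypothetical sketch of one "reduce, upsample, sum" fusion step.
# Not the thesis's code: weights are fixed rather than learned, and
# nearest-neighbour upsampling stands in for the real upsampling layer.

def reduce_channels(feat, weights):
    """Collapse [channel][row][col] to one 2D map (a 1x1 convolution with fixed weights)."""
    rows, cols = len(feat[0]), len(feat[0][0])
    return [[sum(w * feat[c][r][k] for c, w in enumerate(weights))
             for k in range(cols)]
            for r in range(rows)]

def upsample_nearest(fmap, factor):
    """Enlarge a 2D map by an integer factor using nearest-neighbour copying."""
    return [[fmap[r // factor][k // factor]
             for k in range(len(fmap[0]) * factor)]
            for r in range(len(fmap) * factor)]

def fuse(shallow, deep, w_shallow, w_deep, factor=2):
    """Reduce both stages to one channel, upsample the deeper one, add element-wise."""
    s = reduce_channels(shallow, w_shallow)
    d = upsample_nearest(reduce_channels(deep, w_deep), factor)
    return [[a + b for a, b in zip(row_s, row_d)] for row_s, row_d in zip(s, d)]

# Toy input: a 2-channel 4x4 shallow stage and a 2-channel 2x2 deep stage.
shallow = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
deep = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(2)]
fused = fuse(shallow, deep, [0.5, 0.5], [0.5, 0.5])
```

In this toy case the fused map has the resolution of the shallow stage, and each value is the reduced shallow response plus the upsampled deep response at the same location.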
Chinese Library Classification number:	TP3
Collection number:	45290
Release date:	2020-12-19
