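1. College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, Guizhou, China
2. Guiyang Aluminum Magnesium Design & Research Institute Co., Ltd., Guiyang 550009, Guizhou, China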
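JIANG Shi-yi (born 1998), female, from Zhenyuan, Guizhou, is a master's student. She received her bachelor's degree from Nanjing Normal University in 2021. Her research focuses on image segmentation and deep learning in computer image processing. E-mail: jiangsy1998jy@163.com
XU Yang (born 1980), male, from Guiyang, Guizhou, holds a Ph.D. and is an associate professor. He received his Ph.D. from the Institute of Modern Physics, Chinese Academy of Sciences, in 2008. His research focuses on data acquisition and machine learning. E-mail: xuy@gzu.edu.cn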
JIANG Shi-yi, XU Yang, LI Dan-yang, et al. FRKDNet: feature refine semantic segmentation network based on knowledge distillation[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(11): 1590-1599. DOI: 10.37188/CJLCD.2023-0010.
Traditional knowledge distillation methods for semantic segmentation still suffer from incomplete distillation and weak transfer of feature information. In addition, the knowledge passed on by the teacher network is complex, so the positional information of features is easily lost. To address these problems, this paper proposes FRKDNet, a feature refinement semantic segmentation model based on knowledge distillation. First, based on the characteristics of foreground features and background noise, a feature refinement method is designed to separate the foreground content in the distilled knowledge. The teacher network's pseudo-knowledge is filtered out so that more accurate feature content is passed to the student network, which improves the representational power of its features. At the same time, inter-class and intra-class distances are extracted from the implicit encoding of the feature space to obtain the corresponding feature coordinate masks. The student network imitates this feature position information to minimize the gap between its feature positions and those of the teacher network, and a distillation loss is computed against the student network for each, which improves the student network's segmentation accuracy and helps it converge faster. Finally, on the public datasets Pascal VOC and Cityscapes, the method reaches an MIoU of 74.19% and 76.53%, respectively, which is 2.04% and 4.48% higher than the original student network. Compared with mainstream methods, the proposed method offers better segmentation performance and robustness, and provides a new approach to knowledge distillation for semantic segmentation.
semantic segmentation; neural network; knowledge distillation; feature refine; deep learning
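The abstract's first idea, distilling only foreground feature content so that the student is not penalized for disagreeing with the teacher on background noise, can be illustrated with a small sketch. This is not FRKDNet's actual loss, which the abstract does not specify. The names `foreground_mask` and `masked_distillation_loss`, the magnitude threshold used to pick foreground positions, and the squared-error form are all assumptions made for illustration.

```python
"""Illustrative foreground-masked feature distillation (assumed form, not FRKDNet's exact loss)."""
from typing import List

Grid = List[List[float]]  # a single H x W feature channel


def foreground_mask(teacher: Grid, threshold: float) -> List[List[int]]:
    """Mark positions whose teacher activation magnitude exceeds `threshold`.

    Thresholding is an assumed stand-in for the paper's refinement step.
    """
    return [[1 if abs(v) > threshold else 0 for v in row] for row in teacher]


def masked_distillation_loss(teacher: Grid, student: Grid,
                             mask: List[List[int]]) -> float:
    """Mean squared teacher-student difference over masked positions only."""
    total = 0.0
    count = 0
    for t_row, s_row, m_row in zip(teacher, student, mask):
        for t, s, m in zip(t_row, s_row, m_row):
            if m:  # background positions contribute nothing
                total += (t - s) ** 2
                count += 1
    return total / count if count else 0.0


# Toy 2 x 2 feature maps: the left column is "foreground" for the teacher.
teacher = [[0.9, 0.1], [0.8, 0.0]]
student = [[0.5, 0.7], [0.8, 0.6]]

mask = foreground_mask(teacher, threshold=0.5)
loss = masked_distillation_loss(teacher, student, mask)
```

In this toy case the student disagrees strongly with the teacher in the right column, but those positions fall below the threshold and are excluded, so only the left-column mismatch contributes to the loss. A real implementation would operate on multi-channel tensors and combine this term with the segmentation loss and the coordinate-mask term described in the abstract.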