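1. College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, Guizhou
CHEN Na (b. 1995), female, born in Zunyi, Guizhou, master's student. She received her bachelor's degree from Tianjin Polytechnic University in 2018. Her research interests include machine learning and image processing. E-mail: 1171847408@qq.com
LIU Yu-hong (b. 1963), male, born in Guiyang, Guizhou, M.S., professor. He received his master's degree from Guizhou University in 1988. His research interests include intelligent hardware, machine learning, artificial intelligence, big data acquisition and processing, and Internet of Things and cloud computing technologies. E-mail: liuyuhongxy@sina.com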
CHEN Na, ZHANG Rong-fen, LIU Yu-hong, et al. Semantic segmentation algorithm based on class feature attention mechanism fusion[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(2):236-244. DOI: 10.37188/CJLCD.2022-0199.
To address the inaccurate segmentation of object edges and the inconsistent segmentation of different object classes by the DeepLabv3+ model, a semantic segmentation algorithm based on class feature attention mechanism fusion is proposed. First, a class feature attention module is designed at the encoder of DeepLabv3+ to strengthen the correlation between categories, so that semantic information of different categories is better extracted and processed. Second, a multi-level parallel spatial pyramid pooling structure is adopted to strengthen spatial correlation and better extract contextual information at different image scales. Finally, at the decoder, a channel attention module recalibrates the multi-layer fused features, suppressing redundant information and strengthening salient features to improve the representational ability of the network. Effectiveness and generalization experiments on the PASCAL VOC 2012 and Cityscapes datasets show that the improved model reaches a mean intersection over union (mIoU) of 81.34% and 76.27%, respectively. It segments image edges in finer detail and separates categories more clearly, and it significantly outperforms the algorithms compared in the paper.
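For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean intersection over union score is commonly computed from predicted and ground-truth label maps. It is not the authors' evaluation code. The function name `mean_iou`, the flat-list input layout, and the `ignore_index=255` default are assumptions made for illustration, and predictions are assumed to lie in `0..num_classes-1`.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return (mIoU, per-class IoU list) for two flat sequences of class labels.

    Hypothetical helper for illustration; ignore_index=255 is an assumed
    convention for unlabeled pixels, not something stated in the paper.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to the score
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                                          # true positives
        fn = sum(confusion[c]) - tp                                   # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp    # wrongly labeled as c
        union = tp + fp + fn
        ious.append(tp / union if union else None)  # None: class absent everywhere

    present = [v for v in ious if v is not None]
    miou = sum(present) / len(present) if present else 0.0
    return miou, ious


# Toy example with three classes and one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
miou, per_class = mean_iou(pred, target, num_classes=3)
print(round(miou, 4))  # 0.7222
```

In the toy example the per-class IoU values are 1/2, 2/3, and 1, so their mean is 13/18. Averaging only over classes that actually occur is one common choice; an evaluation protocol may handle absent classes differently.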
Keywords: multi-scale feature fusion; class feature; attention mechanism; semantic segmentation; DeepLabv3+