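1. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430081
2. Engineering Research Center for Metallurgical Automation and Measurement Technology, Ministry of Education, Wuhan, Hubei 430081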
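LIU Xiong-biao (born 1998), male, from Jianli, Hubei, master's student. He received his bachelor's degree from Yangtze University in 2021. His research focuses on deep learning and object detection. E-mail: 549250933@qq.com
YANG Xian-zhao (born 1978), male, Ph.D., associate professor. He received his Ph.D. from Wuhan University of Science and Technology in 2012. His research focuses on signal processing and on electromagnetic fields and waves. E-mail: yangxianzhao@wust.edu.cn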
LIU Xiong-biao, YANG Xian-zhao, CHEN Yang, et al. Object detection method based on CIoU improved bounding box loss function[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(5):656-665. DOI: 10.37188/CJLCD.2022-0282.
The loss function strongly influences both detection accuracy and convergence speed in object detection, and its bounding box component is a major factor in both. To address the low localization accuracy of traditional models and their slow convergence during training, this paper proposes an improved bounding box loss function based on the CIoU loss, called BCIoU (Better CIoU). BCIoU resolves the gradient explosion caused by the bounding box aspect ratio when the CIoU loss is differentiated, as well as the premature degradation of the model. It also introduces the width-height relationship and the normalized center-point distance between the overlap region and the target box as additional penalty terms, which improves the detection accuracy and convergence speed of the model. Experimental results on the PASCAL VOC 2007 dataset show that, with the YOLOv3 network, BCIoU yields a relative improvement of 2.09% in mAP50 and 6.88% in AP over the IoU loss, and a relative improvement of 1.64% in mAP50 and 5.35% in AP over the CIoU loss. The convergence speed of the model also improves to some extent. The proposed BCIoU loss can be easily incorporated into current object detection algorithms.
Keywords: computer vision; object detection; bounding box regression; gradient; loss function
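For orientation, the CIoU baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the commonly used CIoU formulation (1 - IoU, plus a normalized center-distance penalty, plus an aspect-ratio term), not the BCIoU loss itself, whose additional penalty terms are not spelled out in the abstract. The function names, the (cx, cy, w, h) box format, and the eps constant are assumptions made for this example. The gradient helper shows that both partial derivatives of the aspect-ratio term carry a 1/(w^2 + h^2) factor, which grows quickly for small normalized boxes; this is one plausible reading of the gradient problem the abstract mentions.

```python
import math


def aspect_term(w, h, gw, gh):
    """Aspect-ratio consistency term v of the CIoU loss."""
    return (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2


def aspect_term_grad(w, h, gw, gh):
    """Analytic (dv/dw, dv/dh); both share the factor 1 / (w^2 + h^2)."""
    diff = math.atan(gw / gh) - math.atan(w / h)
    scale = (8 / math.pi ** 2) * diff / (w ** 2 + h ** 2)
    return -scale * h, scale * w


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for a predicted box and a target box, each (cx, cy, w, h)."""
    cx, cy, w, h = box
    gx, gy, gw, gh = gt

    # Convert center format to corner coordinates.
    x1, y1, x2, y2 = cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    gx1, gy1, gx2, gy2 = gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared center distance, normalized by the squared diagonal
    # of the smallest box enclosing both boxes.
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    diag2 = enc_w ** 2 + enc_h ** 2 + eps
    rho2 = (cx - gx) ** 2 + (cy - gy) ** 2

    # Aspect-ratio term and its trade-off weight.
    v = aspect_term(w, h, gw, gh)
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / diag2 + alpha * v


if __name__ == "__main__":
    target = (0.5, 0.5, 0.4, 0.4)
    print("CIoU loss:", ciou_loss((0.45, 0.55, 0.2, 0.4), target))
    # Same aspect ratio, ten times smaller box: larger gradient magnitude.
    print("grad (large box):", aspect_term_grad(0.2, 0.4, 0.4, 0.4))
    print("grad (small box):", aspect_term_grad(0.02, 0.04, 0.4, 0.4))
```

When the predicted and target boxes share the same aspect ratio, v is zero and the aspect-ratio term contributes nothing regardless of how different the box sizes are, so the loss reduces to the IoU and center-distance parts alone.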