1. Software College, Liaoning Technical University, Huludao 125000, China
2. Aviation Ammunition Research Institute Co., Ltd., China North Industries Group Corporation, Harbin 150000, China
3. Shanghai Institute of Aerospace System Engineering, Shanghai 201100, China
4. Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, Quanzhou 362000, China
[ "肖振久(1968—),男,内蒙古宁城人,硕士,副教授,2004年于辽宁工程技术大学获得硕士学位,主要从事图像与视觉信息计算、机器学习、网络与信息安全方面的研究。E-mail:xiaozhenjiu@lntu.edu.cn" ]
[ "郭杰龙(1988—),男,福建泉州人,硕士,工程师,2015年于中南民族大学获得硕士学位,主要从事机器学习、深度学习方面的研究。E-mail:gjl@fjirsm.ac.cn" ]
XIAO Zhen-jiu, ZHAO Hao-ze, ZHANG Li-li, et al. Object detection algorithm based on adaptive focal CRIoU loss [J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(11): 1468-1480. DOI: 10.37188/CJLCD.2023-0005. (in Chinese)
In object detection, the quantities regressed by traditional bounding-box regression loss functions are not correlated with the evaluation metric IoU (Intersection over Union), and some of their regression attributes are ill-defined, leaving the regression incomplete, lowering detection accuracy and convergence speed, and in some cases even obstructing regression. Sample imbalance also affects the regression task: a large number of low-quality samples impedes loss convergence. To improve detection accuracy and convergence speed, a new bounding-box regression loss function is proposed. First, the design principles are established and a paradigm for IoU-series loss functions is formulated. Second, on the basis of the IoU loss, the ratio of the perimeter of the rectangle formed by the two box centers to the perimeter of the smallest enclosing rectangle of the two boxes is introduced as a penalty term on the center-point distance, and the improved IoU loss is applied to non-maximum suppression (NMS). Then, the width-height errors of the two boxes, normalized by the squared width and height of the smallest enclosing box, are introduced as a width-height penalty term, yielding the CRIoU (Complete Relativity IoU) loss. Finally, an adaptive weighting factor that up-weights the regression loss of high-quality samples is added to CRIoU, defining the adaptive focal CRIoU (AF-CRIoU) loss. Experimental results show that AF-CRIoU improves detection accuracy by up to 8.52% relative to traditional non-IoU losses and by up to 2.69% relative to CIoU-series losses, and that A-CRIoU-NMS (Around CRIoU NMS) improves detection accuracy by 0.14% over the original NMS. Applied to safety-helmet detection, the AF-CRIoU loss also achieves good results.
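The abstract describes the loss only verbally; a minimal Python sketch of a CRIoU-style loss, reconstructed from that description, might look as follows. The exact formula, the epsilon terms, and the focal exponent `gamma` are assumptions, not the paper's definition.

```python
def criou_loss(box_p, box_g, gamma=0.5):
    """CRIoU-style loss sketch. Boxes are (x1, y1, x2, y2); returns a scalar."""
    eps = 1e-9
    # Intersection over Union of predicted and ground-truth boxes.
    xi1, yi1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    xi2, yi2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    iou = inter / (area_p + area_g - inter + eps)

    # Center-distance penalty: perimeter of the axis-aligned rectangle
    # spanned by the two centers, over the perimeter of the smallest
    # enclosing rectangle of both boxes.
    cxp, cyp = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cxg, cyg = (box_g[0] + box_g[2]) / 2, (box_g[1] + box_g[3]) / 2
    cw, ch = abs(cxp - cxg), abs(cyp - cyg)
    ew = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    eh = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    center_pen = 2 * (cw + ch) / (2 * (ew + eh) + eps)

    # Width-height penalty: squared side errors normalized by the
    # squared sides of the enclosing box.
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    wh_pen = (wp - wg) ** 2 / (ew ** 2 + eps) + (hp - hg) ** 2 / (eh ** 2 + eps)

    criou = 1 - iou + center_pen + wh_pen
    # Adaptive focal weighting: up-weight high-quality (high-IoU) samples.
    return iou ** gamma * criou
```

For identical boxes the loss vanishes, and it grows as overlap shrinks or the aspect ratios diverge, which matches the behavior the abstract attributes to CRIoU.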
Keywords: object detection; bounding box regression; IoU loss function; non-maximum suppression; adaptive focal loss
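The A-CRIoU-NMS step mentioned in the abstract amounts to standard NMS with the overlap test swapped for the improved measure. A sketch under that assumption, with plain IoU as a stand-in for the paper's exact score:

```python
def nms(boxes, scores, overlap_fn, thresh=0.5):
    """Greedy NMS. boxes: list of (x1, y1, x2, y2); returns kept indices.
    overlap_fn is pluggable: pass an IoU- or CRIoU-style overlap score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)          # highest-scoring remaining box survives
        keep.append(i)
        # Suppress remaining boxes that overlap box i too strongly.
        order = [j for j in order if overlap_fn(boxes[i], boxes[j]) < thresh]
    return keep

def iou(a, b):
    """Plain IoU, used here as a stand-in overlap score."""
    xi1, yi1 = max(a[0], b[0]), max(a[1], b[1])
    xi2, yi2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)
```

Because the overlap function is a parameter, replacing `iou` with a CRIoU-style score changes which near-duplicate boxes get suppressed without touching the NMS loop itself.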