1. School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China
FAN Xin-chuan (1997—), male, born in Neijiang, Sichuan, China; master's degree candidate. He received his bachelor's degree from Southwest University of Science and Technology in 2016. His research interests include object detection and multi-object tracking. E-mail: 984032052@qq.com
CHEN Chun-mei (1977—), female, born in Mianyang, Sichuan, China; Ph.D., associate professor. She received her Ph.D. from the China Academy of Engineering Physics in 2020. Her research interests include machine vision, computer application software development, network communication and routing, mobile Internet and information processing, and multimedia processing. E-mail: 47920787@qq.com
FAN Xin-chuan, CHEN Chun-mei. Lightweight and high-precision object detection algorithm based on YOLO framework [J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(7): 945-954. DOI: 10.37188/CJLCD.2022-0328.
Multi-scale object detection algorithms for images face a trade-off in which detection accuracy and system overhead constrain each other. To address this, a lightweight, high-precision object detection algorithm based on the YOLO framework is proposed. Within the YOLO framework, the down-sampling and channel attention mechanisms of the MobileNetv3 backbone are improved to extract target features accurately while cutting unnecessary overhead. A feature pyramid with a single-stage headless fusion structure is designed, building receptive fields of different sizes to capture information at multiple scales and strengthen the algorithm's adaptability to multi-scale targets. In addition, SIOU is adopted as the regression loss and Soft-NMS is used to suppress redundant boxes, further improving accuracy. Experiments on the MS COCO dataset and the UA-DETRAC traffic surveillance dataset show that, compared with the original YOLOXs, the improved algorithm reduces the number of model parameters and the computational cost by 64.98% and 57.14% respectively, without loss of accuracy. On UA-DETRAC, mAP@0.5 reaches 70.5%, an improvement of 3.52%, and FPS increases by 14.4%. The improved algorithm thus greatly reduces system overhead while raising accuracy, securing both aspects of detection performance.
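The Soft-NMS post-processing step mentioned in the abstract replaces the hard suppression of standard NMS with a score decay on overlapping boxes. The sketch below is a generic Gaussian-decay Soft-NMS in NumPy, not the paper's exact implementation; the `sigma` and `score_thresh` values are illustrative defaults.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes
    instead of discarding them outright."""
    boxes = boxes.astype(float)
    scores = scores.astype(float).copy()
    keep = []
    idxs = np.arange(len(scores))
    while len(idxs) > 0:
        # Pick the highest-scoring remaining box.
        top = idxs[np.argmax(scores[idxs])]
        keep.append(int(top))
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break
        # Decay the scores of the others by their overlap with the winner.
        ious = iou(boxes[top], boxes[idxs])
        scores[idxs] *= np.exp(-(ious ** 2) / sigma)
        # Drop boxes whose score has decayed to near zero.
        idxs = idxs[scores[idxs] > score_thresh]
    return keep, scores
```

Unlike hard NMS, a heavily overlapped but genuine second object keeps a reduced, non-zero score, which is why Soft-NMS tends to raise mAP in crowded scenes such as traffic surveillance.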
Keywords: object detection; YOLOXs; attention mechanism; lightweight; SIOU; Soft-NMS