Lightweight and high-precision object detection algorithm based on YOLO framework

FAN Xin-chuan; CHEN Chun-mei

doi:10.37188/CJLCD.2022-0328

Image Processing | Updated: 2023-07-06

- Chinese Journal of Liquid Crystals and Displays, Vol. 38, Issue 7, Pages 945-954 (2023)
- Author affiliation:
  
  School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China
- Funding:
  
  Doctoral Fund of Southwest University of Science and Technology (20ZX7123); Project of Sichuan Health and Family Planning Department (17PJ207)
- DOI: 10.37188/CJLCD.2022-0328
  CLC: TP391.4
- Received: 13 November 2022,
  
  Revised: 14 December 2022,
  
  Published: 05 July 2023

FAN Xin-chuan, CHEN Chun-mei. Lightweight and high-precision object detection algorithm based on YOLO framework[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(7): 945-954. DOI: 10.37188/CJLCD.2022-0328.

Abstract

Multi-scale object detection algorithms for images often face a trade-off between detection accuracy and system overhead. To address this, a lightweight, high-precision object detection algorithm based on the YOLO framework is proposed. Within the YOLO framework, and building on the MobileNetv3 network, the down-sampling and channel attention mechanisms are improved to extract target features accurately and reduce unnecessary overhead. A structure that fuses a feature pyramid with a single-stage headless module is designed, constructing different receptive fields to capture information at different scales and improving the algorithm's adaptability to multi-scale targets. In addition, SIoU is used as the regression loss and Soft-NMS is used to process redundant bounding boxes, which improves accuracy. Experiments on the MS COCO dataset and the UA-DETRAC traffic surveillance dataset show that, compared with the original YOLOXs, the improved algorithm reduces the number of model parameters and the computational cost by 64.98% and 57.14%, respectively, without reducing accuracy. On the UA-DETRAC dataset, mAP@0.5 reaches 70.5%, an improvement of 3.52%, and FPS increases by 14.4%. The improved algorithm substantially reduces system overhead while improving accuracy, preserving both aspects of detection performance.
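The abstract names Soft-NMS as the step that handles redundant boxes but gives no details. As a rough illustration only, the sketch below shows the Gaussian variant of Soft-NMS in plain Python: instead of discarding boxes that overlap the current best detection, it decays their scores according to the overlap. The function names `iou` and `soft_nms`, the `(x1, y1, x2, y2)` box format, and the values of `sigma` and `score_thresh` are assumptions made for this example and are not taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (illustrative sketch, hypothetical defaults).

    Returns a list of (original_index, decayed_score) in selection order.
    """
    candidates = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining box with the highest (possibly decayed) score.
        best_pos = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_idx, best_score = candidates.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes by their overlap with it,
        # dropping any box whose score falls below the threshold.
        survivors = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                survivors.append((idx, score))
        candidates = survivors
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 4))
```

In this toy input, the second box overlaps the first heavily, so its score is reduced rather than the box being removed outright, while the distant third box keeps its score. Hard NMS with a typical IoU threshold would have dropped the second box entirely, which is the behaviour Soft-NMS is meant to soften in crowded scenes.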
