Pedestrian multi-target tracking method based on YOLOv5 and person re-identification

Yu-ting HE; Jin CHE; Jin-man WU

doi:10.37188/CJLCD.2022-0025

Image Processing | Updated: 2022-07-25

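- Original title: 基于YOLOv5和重识别的行人多目标跟踪方法
- English title: Pedestrian multi-target tracking method based on YOLOv5 and person re-identification
- Chinese Journal of Liquid Crystals and Displays (液晶与显示), 2022, Vol. 37, No. 7, pp. 880-890
- Author affiliations:
  1. School of Physics and Electronic-Electrical Engineering, Ningxia University, Yinchuan 750021, Ningxia, China
  2. Ningxia Key Laboratory of Intelligent Sensing for Desert Information, Yinchuan 750021, Ningxia, China
- Author biographies:
  Yu-ting HE (b. 1998), female, from Yulin, Shaanxi; master's student. She received her bachelor's degree from Xi'an University of Posts and Telecommunications in 2020, and her research focuses on deep-learning-based person re-identification and tracking. E-mail: 2356854359@qq.com
  Jin CHE (b. 1973), male, from Yinchuan, Ningxia; Ph.D., professor. He received his doctorate from Tianjin University in 2014, and his research focuses on image processing and intelligent video. E-mail: koalache@126.com
- Funding: National Natural Science Foundation of China (61861037)
- DOI: 10.37188/CJLCD.2022-0025
  CLC number: TP391
- Received: 2022-01-24; revised: 2022-02-11; print publication: 2022-07-05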

Yu-ting HE, Jin CHE, Jin-man WU. Pedestrian multi-target tracking method based on YOLOv5 and person re-identification[J]. Chinese Journal of Liquid Crystals and Displays, 2022, 37(7): 880-890. DOI: 10.37188/CJLCD.2022-0025.

Abstract

To address the shortcomings of the current tracking-by-detection paradigm for multi-target tracking, this paper builds on the DeepSort algorithm to reduce the frequent target ID switches caused by occlusion during tracking. First, the appearance model is improved: the original wide residual network is replaced with a ResNeXt network, and a convolutional attention mechanism is introduced into the backbone to construct a new person re-identification network, so that the model attends more closely to key target information and extracts more effective features. Then, YOLOv5 is adopted as the detection algorithm; a detection layer is added so that the model adapts to targets of different sizes, and a coordinate attention mechanism is added to the backbone to further improve detection accuracy. In multi-target tracking experiments on the MOT16 dataset, the method achieves a multiple object tracking accuracy (MOTA) of 66.2% and a multiple object tracking precision (MOTP) of 80.8%, while meeting the requirements of real-time tracking.
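The tracking accuracy quoted in the abstract is conventionally computed with the CLEAR MOT definition, in which misses, false positives, and identity switches are summed over all frames and divided by the total number of ground-truth objects. The sketch below is a minimal illustration of that formula, not the authors' evaluation code: the `FrameCounts` container and the counts fed to it are hypothetical and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class FrameCounts:
    """Per-frame error counts (hypothetical container, not from the paper)."""
    num_gt: int           # ground-truth pedestrians present in the frame
    false_negatives: int  # ground-truth targets the tracker missed
    false_positives: int  # tracker outputs matching no ground-truth target
    id_switches: int      # targets whose assigned ID changed in this frame


def mota(frames):
    """CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT, summed over all frames."""
    total_gt = sum(f.num_gt for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(
        f.false_negatives + f.false_positives + f.id_switches for f in frames
    )
    return 1.0 - errors / total_gt


# Made-up counts for illustration only.
frames = [
    FrameCounts(num_gt=10, false_negatives=2, false_positives=1, id_switches=1),
    FrameCounts(num_gt=10, false_negatives=1, false_positives=0, id_switches=0),
]
score = mota(frames)

# The same sequence with the single ID switch removed scores higher.
frames_no_switch = [
    FrameCounts(num_gt=10, false_negatives=2, false_positives=1, id_switches=0),
    FrameCounts(num_gt=10, false_negatives=1, false_positives=0, id_switches=0),
]
score_no_switch = mota(frames_no_switch)
```

Because identity switches enter the numerator directly, any reduction in occlusion-induced ID switches raises this score, provided misses and false positives do not increase.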

