1. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, Hubei, China
2. Engineering Research Center for Metallurgical Automation and Measurement Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, Hubei, China
[ "王名赫(1994—),男,湖北襄阳人,硕士研究生,2018年于西安工程大学获得学士学位,主要从事图像处理、目标检测、人体姿态估计、深度学习等方面的研究。E-mail:448842599@qq.com" ]
[ "徐望明(1979—),男,湖北武汉人,博士,高级工程师,正高级实验师,2013年于武汉科技大学获得博士学位,主要从事图像处理与模式识别等方面的研究。E-mail:xuwangming@wust.edu.cn" ]
王名赫, 徐望明, 蒋昊坤. 一种改进的轻量级人体姿态估计算法[J]. 液晶与显示, 2023,38(7):955-963. DOI: 10.37188/CJLCD.2022-0323.
WANG Ming-he, XU Wang-ming, JIANG Hao-kun. Improved lightweight human pose estimation algorithm[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(7):955-963. DOI: 10.37188/CJLCD.2022-0323.
Most existing human pose estimation algorithms achieve high accuracy by designing complex network structures, at the cost of low speed. The YOLO-Pose algorithm inherits the advantages of state-of-the-art object detectors and attains both high accuracy and high speed, yet it still suffers from missed and false detections. This paper further improves YOLO-Pose and proposes a new lightweight human pose estimation algorithm that accounts for the non-rigid nature of human poses and the diverse spatial distribution of human keypoints. First, a lightweight channel and spatial attention network (LCSA-Net) is designed to strengthen the model's feature extraction capability. Second, a distance-based adaptive weighting strategy is used to compute the keypoint regression loss during training, enhancing the model's ability to regress distant human keypoints. Experiments on the COCO 2017 human pose dataset show that, compared with the baseline model, both improvements effectively boost pose estimation performance, yielding gains of 2% in mAP, 1.5% in AP50, and 1.7% in AR.
human pose estimation; YOLO-Pose; attention network; adaptive weighting; regression loss
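The abstract describes the two improvements only at a high level, so the following Python (PyTorch) sketches are illustrative assumptions rather than the paper's actual implementation. The first shows what a generic lightweight channel-and-spatial attention block, in the spirit of CBAM, might look like; the class name, reduction ratio, and layer choices are hypothetical and are not the published LCSA-Net design.

```python
import torch
import torch.nn as nn

class LightweightChannelSpatialAttention(nn.Module):
    """CBAM-style channel + spatial attention (illustrative sketch only;
    the real LCSA-Net structure is defined in the full paper)."""

    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        # Channel attention: a shared bottleneck MLP over pooled descriptors.
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )
        # Spatial attention: one conv over stacked per-pixel channel statistics.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=spatial_kernel,
                                      padding=spatial_kernel // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention weights from average- and max-pooled features.
        avg = torch.mean(x, dim=(2, 3), keepdim=True)
        mx = torch.amax(x, dim=(2, 3), keepdim=True)
        x = x * self.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx))
        # Spatial attention weights from per-pixel mean and max over channels.
        avg_s = torch.mean(x, dim=1, keepdim=True)
        max_s = torch.amax(x, dim=1, keepdim=True)
        return x * self.sigmoid(self.spatial_conv(torch.cat([avg_s, max_s], dim=1)))
```

The second sketch illustrates one plausible form of a distance-adaptive weighting for the keypoint regression loss: each visible keypoint's error is scaled by a weight that grows with its normalized distance from the person-box center, so that distant joints (e.g. wrists, ankles) contribute more during training. The weighting function, the plain L1 error term, and all tensor names are assumptions; the paper's actual loss builds on YOLO-Pose's keypoint-similarity-based formulation.

```python
import torch

def distance_weighted_keypoint_loss(pred_kpts, gt_kpts, gt_vis,
                                    box_center, box_scale, alpha=1.0):
    """Illustrative distance-adaptive keypoint regression loss (hypothetical form).

    pred_kpts, gt_kpts: (N, K, 2) predicted / ground-truth keypoint coordinates
    gt_vis:             (N, K)    visibility flags (1 = labelled, 0 = missing)
    box_center:         (N, 2)    person-box centers
    box_scale:          (N,)      person-box diagonal used for normalization
    """
    # Normalized distance of each ground-truth keypoint from its box center.
    dist = torch.linalg.norm(gt_kpts - box_center[:, None, :], dim=-1) / box_scale[:, None]
    weight = 1.0 + alpha * dist                        # assumed weighting: farther -> heavier
    err = torch.abs(pred_kpts - gt_kpts).sum(dim=-1)   # per-keypoint L1 error
    return (weight * err * gt_vis).sum() / gt_vis.sum().clamp(min=1)
```

In a YOLO-Pose-style detector, such an attention block would typically sit in the backbone or neck, and the weighted keypoint loss would be added to the box and objectness losses; the placement and exact formulation used by the authors are given in the full paper.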