
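1. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, Jilin 130033, China
2. University of Chinese Academy of Sciences, Beijing 100049, China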
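LIN Lin (born 1997), male, from Chifeng, Inner Mongolia, master's student. He received his bachelor's degree from the University of Science and Technology of China in 2019. His research focuses on computer vision. E-mail: linlin19@mails.ucas.ac.cn
WANG Yan-jie (born 1963), male, from Changchun, Jilin, M.S., research professor. He received his master's degree from the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences in 1999. His research focuses on digital image processing. E-mail: wangyj@ciomp.ac.cn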
Received: 2021-12-03
Revised: 2022-01-09
Published in print: 2022-07-05
LIN Lin, WANG Yan-jie, SUN Hai-chao. Object 6D pose estimation algorithm based on improved heatmap loss function[J]. Chinese Journal of Liquid Crystals and Displays, 2022, 37(7): 913-923. DOI: 10.37188/CJLCD.2021-0317.
Heatmap regression networks trained with the conventional mean squared error (MSE) loss suffer from limited accuracy and slow training. To address this, this paper proposes Heatmap Wing Loss (HWing Loss), a loss function designed for heatmap regression. HWing Loss assigns different loss values to different pixel values, and its gradient is larger for foreground pixels, so the network pays more attention to foreground pixels and heatmap regression becomes both more accurate and faster. In addition, based on the distribution characteristics of the heatmap, a keypoint inference method based on the Gaussian distribution is used to reduce the quantization error incurred when keypoints are inferred from heatmaps. Building on these two components, a new single-object pose estimation algorithm based on keypoint localization is constructed. Experimental results show that, compared with the same algorithm trained with MSE Loss, the pose estimation algorithm trained with HWing Loss achieves a higher ADD(-S) accuracy, reaching 88.8% on the LINEMOD dataset, and outperforms other recent deep-learning-based pose estimation algorithms. The algorithm runs at up to 25 fps on an RTX3080 GPU, combining speed with accuracy.
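The abstract does not spell out how the Gaussian-based keypoint inference works, so the following is only a minimal sketch of one common way to reduce quantization error, not the authors' implementation. It assumes the predicted heatmap is roughly Gaussian near its peak, which makes the log-heatmap roughly quadratic there; a single Newton step from the integer argmax then gives a sub-pixel estimate. The function names `make_gaussian_heatmap` and `refine_keypoint` are hypothetical and chosen for illustration.

```python
import math


def make_gaussian_heatmap(width, height, cx, cy, sigma):
    """Render a 2D Gaussian heatmap (list of rows) centred at (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def refine_keypoint(heatmap):
    """Return a sub-pixel (x, y) peak estimate from a heatmap.

    Assumption: the heatmap is roughly Gaussian near its maximum, so its
    logarithm is roughly quadratic there. One Newton step on the
    log-heatmap, started at the integer argmax, lands near the centre.
    """
    height, width = len(heatmap), len(heatmap[0])
    # Integer argmax: the coarse, quantized keypoint location.
    my, mx = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda p: heatmap[p[0]][p[1]],
    )
    # Without a full 3x3 neighbourhood, fall back to the integer peak.
    if not (0 < mx < width - 1 and 0 < my < height - 1):
        return float(mx), float(my)

    def log_h(x, y):
        # Clamp to avoid log(0) on empty background pixels.
        return math.log(max(heatmap[y][x], 1e-12))

    # Central finite differences of the log-heatmap at the integer peak.
    dx = 0.5 * (log_h(mx + 1, my) - log_h(mx - 1, my))
    dy = 0.5 * (log_h(mx, my + 1) - log_h(mx, my - 1))
    dxx = log_h(mx + 1, my) - 2.0 * log_h(mx, my) + log_h(mx - 1, my)
    dyy = log_h(mx, my + 1) - 2.0 * log_h(mx, my) + log_h(mx, my - 1)
    dxy = 0.25 * (log_h(mx + 1, my + 1) - log_h(mx + 1, my - 1)
                  - log_h(mx - 1, my + 1) + log_h(mx - 1, my - 1))

    det = dxx * dyy - dxy * dxy
    # A proper peak has a negative-definite Hessian; otherwise keep the argmax.
    if not (dxx < 0.0 and det > 0.0):
        return float(mx), float(my)

    # Newton step: offset = -H^{-1} g, with the 2x2 inverse written out.
    off_x = -(dyy * dx - dxy * dy) / det
    off_y = -(dxx * dy - dxy * dx) / det
    return mx + off_x, my + off_y


# Usage: a synthetic heatmap whose true centre lies between pixels.
heatmap = make_gaussian_heatmap(32, 32, cx=12.3, cy=20.7, sigma=2.0)
x, y = refine_keypoint(heatmap)
print(round(x, 3), round(y, 3))
```

On an ideal Gaussian heatmap the log is exactly quadratic, so the refined location matches the true centre up to floating-point error, whereas the plain argmax would be off by up to half a pixel per axis. On real network outputs the heatmap is only approximately Gaussian, so some smoothing before the refinement step is often helpful.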