Lin LIN, Yan-jie WANG, Hai-chao SUN. Object 6D pose estimation algorithm based on improved heatmap loss function[J]. Chinese Journal of Liquid Crystals and Displays, 2022, 37(7): 913-923. DOI: 10.37188/CJLCD.2021-0317.
Object 6D pose estimation algorithm based on improved heatmap loss function
Heatmap regression networks trained with the mean square error (MSE) loss conventionally used for heatmap regression suffer from low accuracy and slow training. To address this, this paper proposes a loss function for heatmap regression, Heatmap Wing Loss (HWing Loss). The loss takes different values for different pixel values, and its gradient is larger for foreground pixels, so the network focuses more on the foreground pixels and heatmap regression becomes both more accurate and faster. In line with the distribution characteristics of the heatmap, a keypoint inference method based on the Gaussian distribution is adopted to reduce the quantization error incurred when keypoints are inferred from the heatmap. On the basis of these two components, a new monocular pose estimation algorithm based on keypoint localization is constructed. Experiments show that the pose estimation algorithm using HWing Loss achieves a higher ADD(-S) accuracy than the same algorithm using MSE Loss, reaching 88.8% on the LINEMOD dataset, and that it outperforms other recent deep-learning-based pose estimation algorithms. The algorithm runs at up to 25 fps on an RTX 3080 GPU, combining high speed with high accuracy.
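To illustrate how a Gaussian assumption can reduce quantization error at inference time, the sketch below refines the integer argmax of a heatmap to sub-pixel precision. It is not necessarily the exact procedure used in the paper: it assumes the heatmap is approximately a separable Gaussian around the keypoint, so that the logarithm of the heatmap is quadratic along each axis and three samples determine the peak. The function names `make_gaussian_heatmap` and `refine_peak_gaussian` are hypothetical and chosen only for this example.

```python
import math


def make_gaussian_heatmap(width, height, cx, cy, sigma):
    """Build a synthetic heatmap (list of rows) with a Gaussian centered at (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def refine_peak_gaussian(heatmap):
    """Return the (x, y) sub-pixel peak location of a 2D heatmap.

    Assumption: the response near the peak is roughly Gaussian, so its
    logarithm is a parabola along each axis.
    """
    h, w = len(heatmap), len(heatmap[0])
    # Coarse estimate: the integer argmax, which carries the quantization error.
    py, px = max(((r, c) for r in range(h) for c in range(w)),
                 key=lambda rc: heatmap[rc[0]][rc[1]])

    def offset(lo, mid, hi):
        # Fit a parabola to the log-values at positions -1, 0, +1 and
        # return the position of its vertex relative to the center sample.
        eps = 1e-12  # guard against log(0)
        a, b, c = (math.log(max(v, eps)) for v in (lo, mid, hi))
        denom = a - 2.0 * b + c
        if denom >= 0.0:  # flat or not a strict maximum: keep the integer peak
            return 0.0
        return 0.5 * (a - c) / denom

    # Border pixels lack a neighbor on one side, so they are left unrefined.
    dx = offset(heatmap[py][px - 1], heatmap[py][px], heatmap[py][px + 1]) \
        if 0 < px < w - 1 else 0.0
    dy = offset(heatmap[py - 1][px], heatmap[py][px], heatmap[py + 1][px]) \
        if 0 < py < h - 1 else 0.0
    return px + dx, py + dy
```

On a noise-free synthetic Gaussian, this refinement recovers the true center up to floating-point error, whereas the plain argmax can be off by as much as half a pixel per axis. On real network outputs the heatmap is only approximately Gaussian, so the gain would be smaller and may benefit from smoothing the heatmap first.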