TY - JOUR
T1 - A Robust Convolutional Neural Network for 6D Object Pose Estimation from RGB Image with Distance Regularization Voting Loss
AU - Ullah, Faheem
AU - Wei, Wu
AU - Daradkeh, Yousef Ibrahim
AU - Javed, Muhammad
AU - Rabbi, Ihsan
AU - Al Juaid, Hanan
N1 - Publisher Copyright:
© 2022 Faheem Ullah et al.
PY - 2022
Y1 - 2022
AB - Six-degree-of-freedom (6D) pose estimation of objects is important for robot manipulation but challenging for occluded and textureless objects. To overcome this challenge, the proposed method presents an end-to-end robust network for real-time 6D pose estimation of rigid objects from a single RGB image. A fully convolutional network with a feature pyramid is developed that effectively boosts the accuracy of the pixel-wise labels and unit direction vector fields that take part in the voting process for object keypoint estimation. The network further estimates the distance between each pixel and keypoint, which helps select accurate hypotheses in the RANSAC process and avoids hypothesis deviations caused by errors in the unit direction vectors of pixels that lie far from the keypoints. A vectorial distance regularization loss function is used to improve the 2D-3D correspondences between the 3D object keypoints and their estimated 2D counterparts, from which Perspective-n-Point recovers the pose. Experiments are performed on the widely used LINEMOD and Occlusion LINEMOD datasets with the ADD(-S) and 2D projection evaluation metrics. The results show that our method improves pose estimation performance compared to the state of the art while still achieving real-time efficiency.
UR - http://www.scopus.com/inward/record.url?scp=85136713083&partnerID=8YFLogxK
U2 - 10.1155/2022/2037141
DO - 10.1155/2022/2037141
M3 - Article
AN - SCOPUS:85136713083
SN - 1058-9244
VL - 2022
JO - Scientific Programming
JF - Scientific Programming
M1 - 2037141
ER -