Academic Article

G-GOP: Generative Pose Estimation of Reflective Texture-Less Metal Parts With Global-Observation-Point Priors
Document Type
Periodical
Source
IEEE/ASME Transactions on Mechatronics, 29(1):154-165, Feb. 2024
Subject
Power, Energy and Industry Applications
Components, Circuits, Devices and Systems
Pose estimation
Metals
Training
Robustness
Mechatronics
Feature extraction
Cameras
Deep learning
generative models
reflective
6-D pose estimation
texture-less
Language
English
ISSN
1083-4435
1941-014X
Abstract
6-D pose estimation of reflective, texture-less metal parts remains a challenge in intelligent manufacturing. A feature-to-image method was recently proposed to address this problem; however, unexamined factors still limit its precision and robustness, as well as those of other conventional methods. To address this issue, the mechanisms of conventional methods are first analyzed theoretically, and two such factors are identified and summarized. The errors they cause are defined as the discrete template substitution (DTS) error, which arises from the discrete transformation between intermediate results and final poses, and the center approximation substitution (CAS) error, which arises from differences in contour projections across different local fields of view. A novel feature-to-image method is then proposed to reduce these errors. It utilizes a novel pose representation called the observation pose. Unlike the traditional pose, it is represented by six distinct camera motions that decouple the influence of individual parameters on the projected image; because the conversion between the observation pose and the traditional pose is not discrete, the DTS error is reduced. In addition, a new prior called the global observation point (GOP) is introduced: a global point converted from the detector's output that describes the local field of view. Unlike conventional methods, the proposed method can thus establish a mapping between different local fields of view and the resulting differences in contour projection, reducing the CAS error. Three experiments were carried out and compared against different methods, using 2250 RGB images captured by a single MV-CA050-11UC industrial RGB camera fixed on a MELFA RV13FD 6-DoF robot. Compared with state-of-the-art methods, the ADD-S, average rotation, and translation precision increased by 4.8%, 23%, and 17%, respectively, demonstrating the superior robustness and precision of the proposed method.
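The ADD-S metric reported above is the standard symmetric variant of the average distance metric for 6-D pose evaluation: for each model point transformed by the predicted pose, it takes the distance to the nearest model point transformed by the ground-truth pose, then averages. A minimal NumPy sketch of this standard metric (not the authors' evaluation code; function and argument names are illustrative):

```python
import numpy as np

def add_s(model_pts, R_pred, t_pred, R_gt, t_gt):
    """ADD-S: average closest-point distance between the model point
    cloud under the predicted pose and under the ground-truth pose.
    model_pts: (N, 3) array; R_*: (3, 3) rotations; t_*: (3,) translations."""
    pred = model_pts @ R_pred.T + t_pred  # points under predicted pose
    gt = model_pts @ R_gt.T + t_gt        # points under ground-truth pose
    # pairwise distances (N, N); for each predicted point take the
    # distance to its nearest ground-truth point, then average
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    return d.min(axis=1).mean()
```

In practice a KD-tree nearest-neighbor query replaces the dense pairwise distance matrix for large point clouds; a pose is typically counted correct when ADD-S falls below a fraction (commonly 10%) of the object diameter.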