Academic Journal Article

Pixel-Wise Grasp Detection via Twin Deconvolution and Multi-Dimensional Attention
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems for Video Technology (IEEE Trans. Circuits Syst. Video Technol.). 33(8):4002-4010, Aug. 2023
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Feature extraction
Convolution
Deconvolution
Decoding
Robots
Kernel
Correlation
Grasp detection
multi-dimensional attention
twin deconvolution
Language
ISSN
1051-8215
1558-2205
Abstract
Grasp detection is crucial to high-quality robotic grasping. The mainstream encoder-decoder regression approach is attractive for its high accuracy and efficiency. However, two problems remain: the decoder suffers from checkerboard artifacts caused by the uneven overlap of convolution results, and the features from the encoder need further refinement. This paper proposes a novel pixel-wise grasp detection network composed of an encoder, a multi-dimensional attention bottleneck, and a decoder based on twin deconvolution. The proposed decoder adds a twin branch alongside the original transposed convolution branch. The twin branch provides an overlap degree matrix that is used to re-weight the original branch, which eliminates its checkerboard artifacts. In addition, to explore the intrinsic relationships among features and strengthen feature discrimination, residual multi-head self-attention, cross-amplitude attention, and channel attention are integrated, which achieves adaptive feature refinement. Experiments verify the effectiveness of the proposed method.
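The re-weighting idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation. It assumes that the twin branch is a transposed convolution over an all-ones input with an all-ones kernel, so that its output counts how many kernel placements cover each output pixel, and that re-weighting is an element-wise division by that count. The function names `transposed_conv2d` and `twin_deconv2d` are hypothetical.

```python
def transposed_conv2d(x, kernel, stride):
    """Naive 2-D transposed convolution (no padding) on nested lists."""
    h, w = len(x), len(x[0])
    k = len(kernel)  # square kernel assumed
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            # Each input pixel "stamps" a scaled copy of the kernel
            # onto the output; stamps overlap when k > stride.
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out


def twin_deconv2d(x, kernel, stride):
    """Original branch re-weighted by an overlap-degree matrix.

    Assumption (not taken from the paper): the twin branch uses an
    all-ones input and an all-ones kernel, and re-weighting is division.
    With kernel size >= stride every output pixel is covered at least
    once, so the division is safe.
    """
    original = transposed_conv2d(x, kernel, stride)
    ones_x = [[1.0] * len(x[0]) for _ in x]
    ones_k = [[1.0] * len(kernel) for _ in kernel]
    overlap = transposed_conv2d(ones_x, ones_k, stride)
    reweighted = [
        [o / c for o, c in zip(row_o, row_c)]
        for row_o, row_c in zip(original, overlap)
    ]
    return original, overlap, reweighted


# A constant 3x3 input with a 3x3 all-ones kernel and stride 2.
x = [[1.0] * 3 for _ in range(3)]
kernel = [[1.0] * 3 for _ in range(3)]
original, overlap, reweighted = twin_deconv2d(x, kernel, stride=2)
```

In this toy case the original branch output is uneven even though the input is constant (pixels covered by four kernel placements sum to 4.0, corner pixels to 1.0), which is the checkerboard pattern. After division by the overlap matrix the output is constant again.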