Academic Article

Denoised Non-Local Neural Network for Semantic Segmentation
Document Type
Periodical
Source
IEEE Transactions on Neural Networks and Learning Systems, 35(5):7162-7174, May 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
General Topics for Engineers
Semantic segmentation
Reliability
Convolution
Computational modeling
Task analysis
Noise reduction
Context modeling
Attention
non-local (NL) network
Language
English
ISSN
2162-237X (Print)
2162-2388 (Electronic)
Abstract
The non-local (NL) network has become a widely used technique for semantic segmentation; it computes an attention map that measures the relationship between every pair of pixels. However, most popular NL models ignore the fact that the computed attention map is often very noisy, containing both interclass and intraclass inconsistencies, which lowers the accuracy and reliability of NL methods. In this article, we refer to these inconsistencies as attention noise and explore how to remove it. Specifically, we propose a denoised NL network consisting of two primary modules, the global rectifying (GR) block and the local retention (LR) block, which eliminate the interclass and intraclass noise, respectively. First, GR uses class-level predictions to build a binary map that indicates whether two pixels belong to the same category. Second, LR captures the otherwise ignored local dependencies and uses them to fill the unwanted hollows in the attention map. Experimental results on two challenging semantic segmentation datasets demonstrate the superior performance of our model. Without any external training data, the proposed denoised NL achieves state-of-the-art performance of 83.5% and 46.69% mean classwise intersection over union (mIoU) on Cityscapes and ADE20K, respectively.
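To make the idea concrete, below is a minimal PyTorch-style sketch of a non-local block augmented with the two denoising ideas described in the abstract. It is an illustrative reconstruction, not the authors' implementation: the class name DenoisedNonLocal2d, the use of an auxiliary 1x1 classifier to build the GR-style binary same-class map, and the 3x3 convolution standing in for the LR branch are all assumptions made for exposition.

```python
# Illustrative sketch only; module names and design details are assumptions,
# not the paper's official code.
import torch
import torch.nn as nn


class DenoisedNonLocal2d(nn.Module):
    """Non-local attention with (i) a class-consistency mask in the spirit of the
    GR block and (ii) a small local convolution in the spirit of the LR block."""

    def __init__(self, in_channels: int, num_classes: int, key_channels: int = 64):
        super().__init__()
        self.query = nn.Conv2d(in_channels, key_channels, kernel_size=1)
        self.key = nn.Conv2d(in_channels, key_channels, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # Auxiliary per-pixel class logits used to build the binary consistency map (GR idea).
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        # Local branch: a 3x3 conv retaining neighborhood dependencies (LR idea).
        self.local = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        self.out = nn.Conv2d(in_channels, in_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)                 # (b, n, k)
        k = self.key(x).flatten(2)                                   # (b, k, n)
        v = self.value(x).flatten(2).transpose(1, 2)                 # (b, n, c)

        # Standard non-local attention over all pixel pairs.
        attn = torch.softmax(torch.bmm(q, k) / q.shape[-1] ** 0.5, dim=-1)  # (b, n, n)

        # GR-style rectification: keep attention only between pixels predicted
        # to share the same class, then renormalize rows.
        cls = self.classifier(x).argmax(dim=1).flatten(1)            # (b, n)
        same = (cls.unsqueeze(2) == cls.unsqueeze(1)).float()        # (b, n, n) binary map
        attn = attn * same
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-6)

        global_feat = torch.bmm(attn, v).transpose(1, 2).reshape(b, c, h, w)
        local_feat = self.local(x)                                   # LR-style local retention
        return self.out(global_feat + local_feat) + x                # residual connection


if __name__ == "__main__":
    feat = torch.randn(2, 256, 32, 64)                               # backbone features
    block = DenoisedNonLocal2d(in_channels=256, num_classes=19)      # 19 Cityscapes classes
    print(block(feat).shape)                                         # torch.Size([2, 256, 32, 64])
```

In this sketch the binary map suppresses interclass attention noise by zeroing entries between pixels of different predicted classes, while the local 3x3 branch reintroduces neighborhood dependencies that pure global attention can miss; how the paper actually constructs and fuses these components should be taken from the article itself.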