Academic Paper

DCNet: Large-Scale Point Cloud Semantic Segmentation With Discriminative and Efficient Feature Aggregation
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems for Video Technology (IEEE Trans. Circuits Syst. Video Technol.). 33(8):4083-4095, Aug. 2023
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Point cloud compression
Semantics
Three-dimensional displays
Semantic segmentation
Decoding
Aggregates
Feature extraction
point cloud
feature aggregation
attention
Language
ISSN
1051-8215
1558-2205
Abstract
Point cloud feature aggregation, which learns discriminative features from unordered points, plays a key role in large-scale point cloud semantic segmentation. Most previous aggregation methods sample a representative point subset, i.e., one chosen by a carefully designed point density metric, and therefore incur a high computation cost, especially on large-scale point clouds. Although several recent works speed up the point sampling process, the points in the sampled subset are uncertain and may change randomly, which corrupts the geometric structure and discards edges needed to represent an object. We therefore propose DCNet, which consists of an encoder-decoder structure based on fast random point sampling and several fully connected layers for semantic segmentation. To overcome the loss of key features caused by random down-sampling, DCNet introduces two novel local feature aggregation schemes, Double attention and Consistent constraints, which learn features that remain discriminative in these challenging scenarios. The former considers both the topological and the semantic similarity of neighboring points to generate attention features that discriminate between classes with similar geometric structures. The latter imposes class-consistent constraints between adjacent layers in the decoder stage, guiding each point to aggregate the high-level semantic features of points belonging to the same class from the previous layer, which helps distinguish neighboring points of the same class on the boundary. We compare DCNet with existing methods on two benchmarks, S3DIS and Semantic3D. Experiments show that, in mean Intersection-over-Union (mIoU), our method outperforms state-of-the-art methods based on the same fast random sampling by 2-3%, and is comparable to the latest methods that sample more slowly but are more accurate. That is, our method achieves the optimal speed-accuracy trade-off in the field of point cloud segmentation.
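For readers unfamiliar with two terms used in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It shows uniform random down-sampling of a point set and the mIoU metric computed from predicted and ground-truth labels. The function names `random_downsample` and `mean_iou`, the `ratio` and `seed` parameters, and the convention of skipping classes that appear in neither prediction nor ground truth are all assumptions made for illustration; the record does not specify them.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def random_downsample(points: Sequence[Point], ratio: int, seed: int = 0) -> List[int]:
    """Return sorted indices of a uniformly random subset of the points.

    Keeps len(points) // ratio points (at least one). `ratio` and `seed`
    are hypothetical parameters, not values taken from the paper.
    """
    if not points:
        return []
    rng = random.Random(seed)
    keep = max(1, len(points) // ratio)
    return sorted(rng.sample(range(len(points)), keep))


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean over classes of IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both prediction and ground truth
    are skipped rather than counted as zero.
    """
    ious = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    cloud = [(float(i), 0.0, 0.0) for i in range(8)]
    print(random_downsample(cloud, ratio=4))          # two indices out of eight
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2))    # per-class IoU 1/2 and 2/3
```

Random sampling costs time linear in the number of kept points regardless of geometry, which is why it is fast; the trade-off, as the abstract notes, is that which points survive is left to chance.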