Academic Paper

Residual Block Fusion in Low Complexity Neural Network-Based In-loop Filtering for Video Compression
Document Type
Conference
Source
2024 Data Compression Conference (DCC), pp. 392-401, Mar. 2024
Subject
Communication, Networking and Broadcast Technologies
Signal Processing and Analysis
Convolutional codes
Training
Convolution
Filtering
Data compression
Video compression
Encoding
Language
English
ISSN
2375-0359
Abstract
In this paper, a novel low-complexity residual block fusion (RBF) based split luma-chroma architecture is proposed to improve the coding efficiency of a neural network-based in-loop filter for video compression. The residual block in this architecture consists of a 1x1 convolution layer with wide activation and a regular 3x3 convolutional layer that is decomposed into 1x1 pointwise convolutions and 1x3/3x1 separable convolutions via Canonical Polyadic (CP) decomposition to reduce complexity. By adjusting the location of the skip connection in each residual block, adjacent 1x1 pointwise convolutions are fused. The RBF backbone consists of a new wide activation that starts directly with PReLU followed by a 1x1 convolution, while the 1x1 layers produced by CP decomposition are fully fused. This new fusion design reduces the complexity from 17.05 kMAC/pixel to 16.56 kMAC/pixel and the number of convolutional layers by 13%. The experimental results show that the new RBF architecture's BD-Rate is {-0.11%, -0.31%, -0.33%} under All Intra (AI) and {-0.14%, 0.66%, 1.56%} under Random Access (RA) compared to the existing residual block design, while the BD-Rate of the proposed RBF loop filter compared to the VTM anchor is {-4.77%, -9.14%, -9.13%} under AI and {-5.46%, -9.31%, -9.20%} under RA. The actual decoding time is reduced by around 5% after residual block fusion. The BD-Rate versus kMAC/pixel plot also shows a superior trade-off between complexity and coding gain compared to state-of-the-art filters.
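The fusion of adjacent 1x1 pointwise convolutions described in the abstract rests on a simple linear-algebra fact: a 1x1 convolution is a per-pixel linear map over channels, so two such layers with no nonlinearity between them compose into a single layer whose weight matrix is the product of the two. The sketch below (not the authors' code; all shapes and names are illustrative) demonstrates this equivalence with NumPy, which is why the paper adjusts the skip-connection location so that the 1x1 layers become directly adjacent before fusing:

```python
import numpy as np

# Illustrative sketch, not the paper's implementation: fusing two
# adjacent 1x1 pointwise convolutions into one. A 1x1 convolution over
# C channels acts as the same channel-mixing matrix at every pixel, so
#   y = W2 @ (W1 @ x)  ==  (W2 @ W1) @ x
# holds as long as no activation sits between the two layers.
rng = np.random.default_rng(0)

c_in, c_mid, c_out, h, w = 8, 16, 8, 4, 4   # hypothetical channel/spatial sizes
x = rng.standard_normal((c_in, h, w))       # input feature map (C, H, W)
W1 = rng.standard_normal((c_mid, c_in))     # first 1x1 conv weights
W2 = rng.standard_normal((c_out, c_mid))    # adjacent 1x1 conv weights

def conv1x1(W, feat):
    # A 1x1 convolution is a channel-wise matrix multiply at each pixel.
    return np.einsum("oc,chw->ohw", W, feat)

y_two_layers = conv1x1(W2, conv1x1(W1, x))  # run the two layers in sequence
W_fused = W2 @ W1                           # single fused 1x1 conv
y_fused = conv1x1(W_fused, x)               # one layer, same output

assert np.allclose(y_two_layers, y_fused)
```

The fused layer replaces two channel-mixing multiplies with one, which is where the reduction in layer count and kMAC/pixel comes from; the equivalence breaks if a PReLU (or any nonlinearity) sits between the two convolutions, which is why the placement of the activation and skip connection matters in the RBF design.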