Journal Article

Rate-Distortion Optimized Post-Training Quantization for Learned Image Compression
Document Type
Periodical
Author
Source
IEEE Transactions on Circuits and Systems for Video Technology, 34(5):3082-3095, May 2024
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Quantization (signal)
Image coding
Task analysis
Transforms
Rate-distortion
Complexity theory
Distortion
Learned image compression
model quantization
task loss
Language
English
ISSN
1051-8215 (print)
1558-2205 (electronic)
Abstract
Quantizing a floating-point neural network to its fixed-point representation is crucial for Learned Image Compression (LIC) because it improves decoding consistency for interoperability and reduces space-time complexity for implementation. Existing solutions often have to retrain the network for model quantization, which is time-consuming and often impractical. This work instead applies Post-Training Quantization (PTQ) to pretrained, off-the-shelf LIC models. We theoretically prove that minimizing the quantization-induced mean square error (MSE) of model parameters (e.g., weights, biases, and activations) in PTQ is sub-optimal for compression tasks, and we therefore develop a novel Rate-Distortion (R-D) Optimized PTQ (RDO-PTQ) to best retain compression performance. Given a LIC model, RDO-PTQ determines the quantization parameters layer by layer to transform the original floating-point parameters in 32-bit precision (FP32) into fixed-point ones at 8-bit precision (INT8); the optimization compresses a tiny calibration image set to minimize the R-D loss. Experiments on different LICs show the outstanding efficiency of the proposed method, with coding performance closest to that of the floating-point counterparts. Our method is a lightweight, plug-and-play approach that adjusts only the quantization parameters without retraining the model parameters, which is attractive to practitioners. RDO-PTQ is a task-oriented PTQ scheme and is further extended to quantize popular super-resolution and image classification models with negligible performance loss, further evidencing the generalization of our methodology. Related materials will be released at https://njuvision.github.io/RDO-PTQ.
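The abstract's core idea, calibrating per-layer quantization scales against the compression R-D loss rather than against parameter MSE, can be sketched in PyTorch as follows. This is a minimal, illustrative sketch, not the authors' released code: the helper names (ste_round, fake_quant, QuantConv2d, calibrate_layer) and the assumed model interface returning a reconstruction and an estimated bit count are all hypothetical, and only weight quantization is shown for brevity.

```python
# Minimal sketch of R-D optimized post-training quantization (hypothetical
# helper names; assumes a PyTorch LIC model whose forward pass is
# differentiable with respect to the quantization scales).
import torch
import torch.nn as nn
import torch.nn.functional as F

def ste_round(x):
    # Round with a straight-through estimator so gradients reach the scale.
    return x + (torch.round(x) - x).detach()

def fake_quant(x, scale, num_bits=8):
    # Symmetric fake quantization to an INT8 grid, kept in FP32 for training.
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(ste_round(x / scale), -qmax - 1, qmax)
    return q * scale

class QuantConv2d(nn.Module):
    # Wraps one Conv2d; its learnable `scale` is the only calibrated parameter.
    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        self.scale = nn.Parameter(conv.weight.detach().abs().max() / 127.0)

    def forward(self, x):
        w_q = fake_quant(self.conv.weight, self.scale)
        return F.conv2d(x, w_q, self.conv.bias,
                        self.conv.stride, self.conv.padding)

def calibrate_layer(model, wrapper, calib_images, lam=0.01, steps=100):
    # Assumes `model(x)` returns (x_hat, bits): the reconstruction and an
    # estimated bit count from the entropy model (hypothetical interface);
    # `wrapper` is a QuantConv2d already installed inside `model`.
    opt = torch.optim.Adam([wrapper.scale], lr=1e-3)
    for _ in range(steps):
        for x in calib_images:
            x_hat, bits = model(x)
            rate = bits / x.numel()      # bits-per-pixel proxy
            dist = F.mse_loss(x_hat, x)  # reconstruction distortion
            loss = rate + lam * dist     # R-D loss, not weight MSE
            opt.zero_grad()
            loss.backward()
            opt.step()
```

In the paper, weights, biases, and activations are all quantized and calibration proceeds layer by layer over a tiny image set; this sketch tunes a single weight scale only, to keep the optimization loop readable.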