Academic Paper

Quantization of Generative Adversarial Networks for Efficient Inference: A Methodological Study
Document Type
Conference
Source
2022 26th International Conference on Pattern Recognition (ICPR), pp. 2179-2185, Aug. 2022
Subject
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Performance evaluation
Training
Quantization (signal)
Computational modeling
Semantics
Computer architecture
Neural network compression
Language
English
ISSN
2831-7475
Abstract
Generative adversarial networks (GANs) have an enormous potential impact on digital content creation, e.g., photorealistic digital avatars, semantic content editing, and quality enhancement of speech and images. However, the performance of modern GANs comes at the cost of massive amounts of computation at inference time and high energy consumption, which complicates, or even prevents, their deployment on edge devices. The problem can be mitigated with quantization, a neural network compression technique that facilitates hardware-friendly inference by replacing floating-point computations with low-bit integer ones. While quantization is well established for discriminative models, the performance of modern quantization techniques applied to GANs remains unclear. GANs generate content of a more complex structure than discriminative models produce, which makes their quantization significantly more challenging. To tackle this problem, we perform an extensive experimental study of state-of-the-art quantization techniques on three diverse GAN architectures, namely StyleGAN, Self-Attention GAN, and CycleGAN. As a result, we discovered practical recipes that allowed us to successfully quantize these models for inference with 4/8-bit weights and 8-bit activations while preserving the quality of the original full-precision models.
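For readers unfamiliar with the technique, the following is a minimal sketch of the kind of uniform "fake" quantization the abstract alludes to: mapping float tensors onto a low-bit integer grid and back to simulate 4/8-bit weights and 8-bit activations. It is an illustrative assumption, not the paper's specific method; the function name, the symmetric per-tensor scheme, and the NumPy simulation are all choices made here for clarity.

```python
import numpy as np

def quantize_dequantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Simulate symmetric uniform quantization of a tensor to num_bits.

    Rounds float values onto a signed integer grid and maps them back,
    the standard "fake quantization" used to evaluate low-bit inference.
    """
    qmax = 2 ** (num_bits - 1) - 1       # e.g. 127 for 8-bit, 7 for 4-bit
    scale = np.max(np.abs(x)) / qmax     # per-tensor scale from the value range
    if scale == 0:                       # all-zero tensor: nothing to quantize
        return x
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)  # integer grid
    return q * scale                     # dequantize for float simulation

# Hypothetical example: 4-bit weights and 8-bit activations, as in the study.
weights = np.random.randn(256, 256).astype(np.float32)
activations = np.random.rand(1, 256).astype(np.float32)

w_q = quantize_dequantize(weights, num_bits=4)
a_q = quantize_dequantize(activations, num_bits=8)
print("mean absolute weight error:", np.abs(weights - w_q).mean())
```

In practice, the quantization parameters (scale, and a zero-point for asymmetric schemes) are calibrated per layer or per channel, which is where much of the GAN-specific difficulty studied in the paper arises.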