Academic Paper

High-Fidelity Image Compression with Score-based Generative Models

Document Type

Working Paper

Author

Hoogeboom, Emiel; Agustsson, Eirikur; Mentzer, Fabian; Versari, Luca; Toderici, George; Theis, Lucas

Source

Subject

Electrical Engineering and Systems Science - Image and Video Processing
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning
Statistics - Machine Learning

Language

Abstract

Despite the tremendous success of diffusion generative models in text-to-image generation, replicating this success in the domain of image compression has proven difficult. In this paper, we demonstrate that diffusion can significantly improve perceptual quality at a given bit-rate, outperforming state-of-the-art approaches PO-ELIC and HiFiC as measured by FID score. This is achieved using a simple but theoretically motivated two-stage approach combining an autoencoder targeting MSE followed by a further score-based decoder. However, as we will show, implementation details matter and the optimal design decisions can differ greatly from typical text-to-image models.

Online Access

Open Access (arXiv)
