Academic Paper

Conceptual and Hierarchical Latent Space Decomposition for Face Editing
Document Type
Conference
Source
2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 7214-7223, Oct. 2023
Subject
Computing and Processing
Signal Processing and Analysis
Solid modeling
Three-dimensional displays
Image synthesis
Pipelines
Lighting
Aerospace electronics
Generative adversarial networks
Language
ISSN
2380-7504
Abstract
Generative Adversarial Networks (GANs) can produce photo-realistic results using an unconditional image-generation pipeline. However, the images generated by GANs (e.g., StyleGAN) are entangled in feature spaces, which makes it difficult to interpret and control the contents of images. In this paper, we present an encoder-decoder model that decomposes the entangled GAN space into a conceptual and hierarchical latent space in a self-supervised manner. The outputs of 3D morphable face models are leveraged to independently control image synthesis parameters like pose, expression, and illumination. For this purpose, a novel latent space decomposition pipeline is introduced using transformer networks and generative models. Later, this new space is used to optimize a transformer-based GAN space controller for face editing. In this work, a StyleGAN2 model for faces is utilized. Since our method manipulates only GAN features, the photo-realism of StyleGAN2 is fully preserved. The results demonstrate that our method qualitatively and quantitatively outperforms baselines in terms of identity preservation and editing precision.