Academic Paper

StyleGAN-Based Advanced Semantic Segment Encoder for Generative AI
Document Type
Periodical
Author
Source
IT Professional (IT Prof.). 26(2):17-23, Apr. 2024
Subject
Computing and Processing
Engineering Profession
Components, Circuits, Devices and Systems
Power, Energy and Industry Applications
Costs
Generative AI
Noise measurement
Semantics
Image restoration
Image processing
Artificial intelligence
Face recognition
Semantic segmentation
Facial features
Language
ISSN
1520-9202
1941-045X
Abstract
StyleGAN is a widely used model for generating high-quality images across many AI domains. Despite its advantages, it has one notable drawback: its per-pixel noise inputs. These noise inputs are independent of location information, and because random noise is inserted at intervals on a per-pixel basis, they hinder the learning of natural location information. The problem is especially pronounced when generating human faces. StyleGAN3 was introduced to overcome this, but it did not completely solve the problem. When a human face is turned more than 30° from the front, the restoration rate decreases further. In this article, we propose an advanced semantic segment encoder that accurately generates the eyes, nose, and mouth even when the face is rotated more than 60°. We also developed a face-angle analyzer to accurately measure the angle of a person’s face. The proposed approach improved restoration performance by approximately 30% compared with existing encoders when the face is not facing straight ahead.