Academic Paper

PortraitBooth: A Versatile Portrait Model for Fast Identity-Preserved Personalization
Document Type
Conference
Source
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27070-27080, Jun. 2024
Subject
Computing and Processing
Training
Limiting
Image recognition
Image synthesis
Scalability
Face recognition
Text to image
Language
ISSN
2575-7075
Abstract
Recent advancements in personalized image generation using diffusion models have been noteworthy. However, existing methods suffer from inefficiencies due to the requirement for subject-specific fine-tuning. This computationally intensive process hinders efficient deployment, limiting practical usability. Moreover, these methods often grapple with identity distortion and limited expression diversity. In light of these challenges, we propose PortraitBooth, an innovative approach designed for high efficiency, robust identity preservation, and expression-editable text-to-image generation, without the need for fine-tuning. PortraitBooth leverages subject embeddings from a face recognition model for personalized image generation without fine-tuning. It eliminates computational overhead and mitigates identity distortion. The introduced dynamic identity preservation strategy further ensures close resemblance to the original image identity. Moreover, PortraitBooth incorporates emotion-aware cross-attention control for diverse facial expressions in generated images, supporting text-driven expression editing. Its scalability enables efficient and high-quality image creation, including multi-subject generation. Extensive results demonstrate superior performance over other state-of-the-art methods in both single and multiple image generation scenarios. Our project page is at https://portraitbooth.github.io.
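To make the abstract's two core ideas concrete, the PyTorch sketch below illustrates (a) projecting a frozen face-recognition embedding into the text-token space so it can condition a diffusion U-Net without any subject-specific fine-tuning, and (b) an identity-preservation term based on cosine similarity of face embeddings. All names, shapes, and module choices here (IdentityConditioner, num_id_tokens, the dummy tensors) are illustrative assumptions for exposition, not the authors' actual architecture or training code.

```python
# Minimal sketch, assuming a frozen face recognizer and a frozen text encoder
# whose outputs are represented here by dummy tensors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class IdentityConditioner(nn.Module):
    """Maps a face-recognition embedding into the text-token space so it can be
    concatenated with prompt tokens and consumed by cross-attention layers."""

    def __init__(self, id_dim: int = 512, token_dim: int = 768, num_id_tokens: int = 4):
        super().__init__()
        self.proj = nn.Linear(id_dim, token_dim * num_id_tokens)
        self.num_id_tokens = num_id_tokens
        self.token_dim = token_dim

    def forward(self, id_embedding: torch.Tensor) -> torch.Tensor:
        # id_embedding: (B, id_dim) -> (B, num_id_tokens, token_dim)
        tokens = self.proj(id_embedding)
        return tokens.view(-1, self.num_id_tokens, self.token_dim)


def identity_preservation_loss(id_ref: torch.Tensor, id_gen: torch.Tensor) -> torch.Tensor:
    """Penalize identity drift: 1 - cosine similarity between the reference
    face embedding and the embedding of the generated face."""
    return 1.0 - F.cosine_similarity(id_ref, id_gen, dim=-1).mean()


# Toy usage with placeholder tensors standing in for real model outputs.
batch, id_dim, token_dim = 2, 512, 768
face_embedding = torch.randn(batch, id_dim)        # from a frozen face recognizer
prompt_tokens = torch.randn(batch, 77, token_dim)  # from a frozen text encoder

conditioner = IdentityConditioner(id_dim, token_dim)
id_tokens = conditioner(face_embedding)            # (2, 4, 768)

# Concatenated sequence that would feed the U-Net's cross-attention.
conditioning = torch.cat([prompt_tokens, id_tokens], dim=1)

# Embedding of the generated image's face (placeholder here).
gen_face_embedding = torch.randn(batch, id_dim)
loss_id = identity_preservation_loss(face_embedding, gen_face_embedding)
print(conditioning.shape, loss_id.item())
```

Because the identity signal enters as extra conditioning tokens rather than as tuned model weights, a single forward pass personalizes the output, which is what lets this family of methods avoid per-subject fine-tuning.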