Academic Paper

SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Color Editing
Document Type
Conference
Source
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19698-19707, Jun. 2022
Subject
Computing and Processing
Photography
Computer vision
Image color analysis
Semantics
Focusing
Software
Pattern recognition
Computational photography; Vision + language
Language
English
ISSN
2575-7075
Abstract
Recently, large pretrained models (e.g., BERT, StyleGAN, CLIP) have shown great knowledge-transfer and generalization capability on various downstream tasks within their domains. Inspired by these efforts, in this paper we propose a unified model for open-domain image editing, focusing on color and tone adjustment of open-domain images while keeping their original content and structure. Our model learns a unified editing space that is more semantic, intuitive, and easy to manipulate than the operation space (e.g., contrast, brightness, color curves) used in much existing photo-editing software. Our model belongs to the image-to-image translation framework, consisting of an image encoder and decoder, and is trained on pairs of before-and-after edited images to produce multimodal outputs. We show that by inverting image pairs into latent codes of the learned editing space, our model can be leveraged for various downstream editing tasks such as language-guided image editing, personalized editing, editing-style clustering, and retrieval. We extensively study the unique properties of the editing space in experiments and demonstrate superior performance on the aforementioned tasks. (Code and supplementary material can be found at the project page: https://jshi31.github.io/SpaceEdit.)
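The core idea in the abstract, inverting a before/after image pair into a latent "editing code" and then applying that code to new images, can be illustrated with a deliberately simplified sketch. This is not the paper's model: here the editing space is just a per-channel affine color transform recovered by least squares, whereas SpaceEdit uses a learned encoder/decoder. All function names below are hypothetical.

```python
import numpy as np

# Toy stand-in for an "editing space": a per-channel affine color
# transform z = (gain, bias). Inverting a before/after pair recovers z;
# applying z transfers the same edit to a different image.

def invert_pair(before, after):
    """Recover a per-channel (gain, bias) code mapping before -> after."""
    z = []
    for c in range(before.shape[-1]):
        x = before[..., c].ravel()
        y = after[..., c].ravel()
        # Solve y ~= gain * x + bias in the least-squares sense.
        A = np.stack([x, np.ones_like(x)], axis=1)
        gain, bias = np.linalg.lstsq(A, y, rcond=None)[0]
        z.append((gain, bias))
    return np.array(z)  # shape: (channels, 2)

def apply_code(image, z):
    """Apply an editing code to a new image, clipping to [0, 1]."""
    out = np.empty_like(image)
    for c, (gain, bias) in enumerate(z):
        out[..., c] = image[..., c] * gain + bias
    return np.clip(out, 0.0, 1.0)

# Example: a synthetic "warming" edit boosts red and dims blue.
rng = np.random.default_rng(0)
before = rng.uniform(0.2, 0.75, size=(8, 8, 3))
after = np.clip(before * [1.2, 1.0, 0.85] + [0.05, 0.0, 0.0], 0.0, 1.0)

z = invert_pair(before, after)   # invert the pair into an editing code
new_image = rng.uniform(0.2, 0.75, size=(8, 8, 3))
edited = apply_code(new_image, z)  # transfer the same edit to a new image
```

The transfer step mirrors the abstract's downstream uses: once edits live in a common code space, codes can also be clustered or compared for retrieval, independent of the images they came from.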