Academic Paper

Deformable Sprites for Unsupervised Video Decomposition
Document Type
Conference
Source
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2647-2656, Jun. 2022
Subject
Computing and Processing
Deformable models
Training
Computer vision
Pattern recognition
Internet
Sprites (computer)
Standards
Segmentation, grouping and shape analysis; Image and video synthesis and generation; Motion and tracking; Video analysis and understanding
Language
ISSN
2575-7075
Abstract
We describe a method to extract persistent elements of a dynamic scene from an input video. We represent each scene element as a Deformable Sprite consisting of three components: 1) a 2D texture image for the entire video, 2) per-frame masks for the element, and 3) non-rigid deformations that map the texture image into each video frame. The resulting decomposition allows for applications such as consistent video editing. Deformable Sprites are a type of video auto-encoder model that is optimized on individual videos, and does not require training on a large dataset, nor does it rely on pretrained models. Moreover, our method does not require object masks or other user input, and discovers moving objects of a wider variety than previous work. We evaluate our approach on standard video datasets and show qualitative results on a diverse array of Internet videos.
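The three-part representation named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes an optimized auto-encoder with non-rigid deformations, whereas the code below only shows how a frame could be composited once textures, masks, and deformations are already known. The names `DeformableSprite` and `render_frame`, the grayscale list-of-lists images, and the integer lookup table standing in for a deformation are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[float]]                # grayscale image, indexed [row][col]
CoordMap = List[List[Tuple[int, int]]]   # per frame pixel: (row, col) in the texture


@dataclass
class DeformableSprite:
    """Hypothetical container mirroring the three components in the abstract."""
    texture: Image          # 1) one 2D texture shared by the entire video
    masks: List[Image]      # 2) per-frame masks, values in [0, 1]
    warps: List[CoordMap]   # 3) per-frame map from frame pixels into the texture


def render_frame(sprites: List[DeformableSprite], t: int) -> Image:
    """Composite frame t as the mask-weighted sum of each sprite's warped texture."""
    height = len(sprites[0].masks[t])
    width = len(sprites[0].masks[t][0])
    frame = [[0.0] * width for _ in range(height)]
    for sprite in sprites:
        for r in range(height):
            for c in range(width):
                tex_r, tex_c = sprite.warps[t][r][c]  # nearest-pixel lookup (assumption)
                frame[r][c] += sprite.masks[t][r][c] * sprite.texture[tex_r][tex_c]
    return frame


# Toy video: two frames of size 1x2. The background pans one texel to the right,
# and a one-texel foreground element moves from the left pixel to the right pixel.
background = DeformableSprite(
    texture=[[0.1, 0.2, 0.3, 0.4]],
    masks=[[[0.0, 1.0]], [[1.0, 0.0]]],
    warps=[[[(0, 0), (0, 1)]], [[(0, 1), (0, 2)]]],
)
foreground = DeformableSprite(
    texture=[[1.0]],
    masks=[[[1.0, 0.0]], [[0.0, 1.0]]],
    warps=[[[(0, 0), (0, 0)]], [[(0, 0), (0, 0)]]],
)

frames = [render_frame([background, foreground], t) for t in range(2)]

# Editing the shared texture once changes the element in every frame.
foreground.texture[0][0] = 0.5
edited = [render_frame([background, foreground], t) for t in range(2)]
```

Because each element has a single texture for the whole video, the edit to `foreground.texture` shows up in both rendered frames, which is the property the abstract points to when it mentions consistent video editing.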