Academic Paper

Progressive Temporal Feature Alignment Network for Video Inpainting
Document Type
Conference
Source
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16443-16452, Jun. 2021
Subject
Computing and Processing
Deep learning
Visualization
Computer vision
Three-dimensional displays
Image resolution
Convolution
Feature extraction
Language
English
ISSN
2575-7075
Abstract
Video inpainting aims to fill spatiotemporal "corrupted" regions with plausible content. To achieve this goal, it is necessary to find correspondences from neighbouring frames to faithfully hallucinate the unknown content. Current methods achieve this goal through attention, flow-based warping, or 3D temporal convolution. However, flow-based warping can create artifacts when optical flow is not accurate, while temporal convolution may suffer from spatial misalignment. We propose the 'Progressive Temporal Feature Alignment Network', which progressively enriches features extracted from the current frame with features warped from neighbouring frames using optical flow. Our approach corrects the spatial misalignment in the temporal feature propagation stage, greatly improving the visual quality and temporal consistency of the inpainted videos. Using the proposed architecture, we achieve state-of-the-art performance on the DAVIS and FVI datasets compared to existing deep learning approaches. Code is available at https://github.com/MaureenZOU/TSAM.
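The abstract's core operation, warping features from a neighbouring frame toward the current frame with optical flow, can be illustrated with a minimal sketch. This is not the paper's TSAM implementation; `warp_features` is a hypothetical helper showing only the generic backward-warping step (bilinear sampling of a feature map at flow-displaced coordinates), written here in plain NumPy for clarity.

```python
import numpy as np

def warp_features(feat, flow):
    """Warp a neighbouring frame's feature map toward the current frame.

    feat: (H, W, C) feature map of the neighbouring frame.
    flow: (H, W, 2) backward optical flow; flow[y, x] gives the (dx, dy)
          offset to the sampling location in the neighbouring frame.
    Returns an (H, W, C) feature map aligned to the current frame.
    (Hypothetical sketch, not the paper's TSAM module.)
    """
    H, W, C = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Flow-displaced source coordinates, clipped to the image border.
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    # Integer corners and fractional weights for bilinear sampling.
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = (sx - x0)[..., None], (sy - y0)[..., None]
    # Blend the four neighbouring feature vectors.
    return ((1 - wy) * ((1 - wx) * feat[y0, x0] + wx * feat[y0, x1])
            + wy * ((1 - wx) * feat[y1, x0] + wx * feat[y1, x1]))
```

With zero flow this reduces to the identity, and a constant flow of (1, 0) shifts the feature map one pixel; the paper's contribution is to apply such warped features progressively, stage by stage, to correct spatial misalignment during temporal feature propagation.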