Academic Paper

Deep Video Decaptioning via Subtitle Mask Prediction and Inpainting

Document Type

Conference

Author

Tu, Yifei; Li, Yuhang; Cai, Feifan; Wang, Chao; Liang, Bing; Fan, Jiaxin; Ding, Youdong

Source

2022 IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC). 5:1836-1839 Dec, 2022

Subject

Communication, Networking and Broadcast Technologies
Computing and Processing
Engineering Profession
Robotics and Control Systems
Visualization
Automation
Streaming media
Maintenance engineering
Real-time systems
Information management
video decaptioning
mask prediction
deep learning
encoder-decoder

Language

ISSN

2693-2776

Abstract

Video decaptioning is the process of automatically removing subtitles from video frames and inpainting the captioned regions. However, directly transferring deep-learning-based image inpainting methods to video decaptioning requires masks of the subtitled regions, which are unavailable for subtitled video frames. To address this issue, we propose a two-stage lightweight framework. The first stage, caption mask prediction, uses an encoder-decoder fully convolutional network with residual blocks to predict the caption mask. The second stage, background inpainting, uses an encoder-decoder structure with attention modules and skip connections to repair the background areas. Extensive experiments demonstrate that our proposed model produces better visual results and outperforms state-of-the-art methods.
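
The two-stage flow described in the abstract (predict a caption mask, then repair only the masked pixels) can be illustrated with a minimal sketch. This is not the authors' model: both stages below are hypothetical stand-ins chosen so the example runs without any deep-learning library. `predict_caption_mask` uses a brightness threshold in place of the encoder-decoder mask network, and `inpaint_background` uses a neighbour average in place of the attention-based inpainting network. The function names, the `threshold` parameter, and the toy frame are all assumptions made for illustration.

```python
# Toy two-stage decaptioning pipeline on a small grayscale frame.
# Both stages are hypothetical stand-ins, NOT the networks from the paper.

def predict_caption_mask(frame, threshold=0.9):
    """Stage 1 stand-in: flag very bright pixels as caption (1), others 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in frame]


def inpaint_background(frame, mask):
    """Stage 2 stand-in: replace each masked pixel with the mean of its
    unmasked 4-neighbours. Unmasked pixels are copied through unchanged.
    A masked pixel with no unmasked neighbour keeps its original value."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            neighbours = [
                frame[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width and not mask[ny][nx]
            ]
            if neighbours:
                out[y][x] = sum(neighbours) / len(neighbours)
    return out


def decaption(frame):
    """Run stage 1, then stage 2, and return (repaired_frame, mask)."""
    mask = predict_caption_mask(frame)
    return inpaint_background(frame, mask), mask


# Usage: a 3x3 frame with one bright "caption" pixel in the centre.
frame = [
    [0.25, 0.25, 0.25],
    [0.25, 1.00, 0.25],
    [0.25, 0.25, 0.25],
]
repaired, mask = decaption(frame)
```

The point of the sketch is the interface between the stages: the mask is produced from the frame itself rather than supplied by the user, which is the gap in plain image inpainting methods that the abstract identifies.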
