학술논문

Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data
Document Type
Conference
Source
2021 IEEE/CVF International Conference on Computer Vision (ICCV) ICCV Computer Vision (ICCV), 2021 IEEE/CVF International Conference on. :3323-3333 Oct, 2021
Subject
Computing and Processing
Training
Geometry
Computer vision
Three-dimensional displays
Limiting
Video sequences
Surgery
Medical
biological
and cell microscopy
Image and video synthesis
Language
ISSN
2380-7504
Abstract
Research in unpaired video translation has mainly focused on short-term temporal consistency by conditioning on neighboring frames. However for transfer from simulated to photorealistic sequences, available information on the underlying geometry offers potential for achieving global consistency across views. We propose a novel approach which combines unpaired image translation with neural rendering to transfer simulated to photorealistic surgical abdominal scenes. By introducing global learnable textures and a lighting-invariant view-consistency loss, our method produces consistent translations of arbitrary views and thus enables long-term consistent video synthesis. We design and test our model to generate video sequences from minimally-invasive surgical abdominal scenes. Because labeled data is often limited in this domain, photorealistic data where ground truth information from the simulated domain is preserved is especially relevant. By extending existing image-based methods to view-consistent videos, we aim to impact the applicability of simulated training and evaluation environments for surgical applications. Code and data: http://opencas.dkfz.de/video-sim2real.