Academic Paper

Geometric Analysis on Images with Deep Learning
Document Type
Dissertation/ Thesis
Source
Subject
Deep learning
Image
Camera calibration
Line segment detection
Wireframe
Language
English
Abstract
In this dissertation, we study geometric analysis on images with deep learning. First, we study the problem of single-image camera calibration for man-made scenes. Unlike previous neural approaches, which rely only on semantic cues obtained from neural networks, our approach considers both semantic and geometric cues, resulting in significant accuracy improvements. Supervised by datasets annotated with the horizontal line and focal length of each image, our networks can be trained to estimate these camera parameters. Based on the Manhattan world assumption, we can further estimate camera rotation and focal length in a weakly-supervised manner. Second, we study the problem of line segment detection in an image. Our deep line segment detector is designed to take advantage of two different but complementary spaces using two subnetworks: a pixel analysis network and a line segment detection network. The pixel analysis network extracts evidence for line segments in the pixel space, while the line segment detection network finds the parameters of line segments in the parameter space by exploiting the pixel-space evidence. Third, we study the problem of wireframe parsing in an image. We propose a simple but effective neural network called DeRasterNet that holistically transforms an input image into structured line segments. The proposed network directly derasterizes the input image into multiple line segments in each patch using the backbone ConvNet features. An estimated line segment is parameterized with the 2D coordinates of its two endpoints, and the patch-wise derasterization loss is measured as the mean squared error between the rasterization of the distance fields for the estimated line segments and the rasterization of the ground-truth line segments in the patch. Experimental results demonstrate the effectiveness of the proposed network.
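The following is a minimal sketch of the patch-wise derasterization loss as described in the abstract, assuming a PyTorch implementation. The function names, the 32×32 patch size, and the choice to rasterize the ground-truth segments as a distance field as well are illustrative assumptions, not details taken from the dissertation.

```python
import torch


def rasterize_distance_field(segments, patch_size):
    """Rasterize line segments into a per-pixel distance field over a patch.

    segments: (N, 4) tensor of endpoint coordinates (x1, y1, x2, y2).
    Returns a (patch_size, patch_size) tensor holding, for each pixel,
    the distance to the nearest line segment.
    """
    ys, xs = torch.meshgrid(
        torch.arange(patch_size, dtype=torch.float32),
        torch.arange(patch_size, dtype=torch.float32),
        indexing="ij",
    )
    pixels = torch.stack([xs, ys], dim=-1).reshape(-1, 2)       # (P, 2)

    p1 = segments[:, 0:2]                                       # (N, 2)
    p2 = segments[:, 2:4]                                       # (N, 2)
    d = p2 - p1                                                 # (N, 2)
    len_sq = (d * d).sum(dim=1).clamp(min=1e-8)                 # (N,)

    # Project each pixel onto each segment, clamped to the segment extent.
    diff = pixels[:, None, :] - p1[None, :, :]                  # (P, N, 2)
    t = (diff * d[None]).sum(dim=-1) / len_sq                   # (P, N)
    t = t.clamp(0.0, 1.0)
    closest = p1[None] + t[..., None] * d[None]                 # (P, N, 2)
    dist = (pixels[:, None, :] - closest).norm(dim=-1)          # (P, N)
    return dist.min(dim=1).values.reshape(patch_size, patch_size)


def derasterization_loss(pred_segments, gt_segments, patch_size=32):
    """Mean squared error between the distance-field rasterizations of the
    predicted and ground-truth line segments within a single patch."""
    pred_map = rasterize_distance_field(pred_segments, patch_size)
    gt_map = rasterize_distance_field(gt_segments, patch_size)
    return torch.mean((pred_map - gt_map) ** 2)
```

Because the rasterized distance field is a differentiable function of the segment endpoints, such a loss can be backpropagated directly to the predicted endpoint coordinates.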