Academic Paper

Convolutional Neural Network for 3D object recognition using volumetric representation
Document Type
Conference
Source
2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE), pp. 1-5, Jul. 2016
Subject
Computing and Processing
Three-dimensional displays
Training
Solid modeling
Object recognition
Two dimensional displays
Computational modeling
Image resolution
Language
Abstract
Following the success of Convolutional Neural Networks (CNNs) in object recognition on 2D images, this paper extends them to process 3D data. Most current systems require a huge amount of computation to handle large amounts of data. This paper presents an efficient 3D volumetric object representation, Volumetric Accelerator (VOLA), which requires much less memory than conventional volumetric representations. On this basis, several 3D digit datasets are introduced, generated from 2D MNIST and 2D digit fonts with different rotations about the x, y, and z axes. Finally, we introduce a combination of multiple CNN models based on the well-known LeNet model. The CNN models trained on the generated datasets achieve average accuracies of 90.30% and 81.85% on the 3D-MNIST and 3D-Fonts datasets, respectively. Experimental results show that VOLA-based CNNs run 1.5x faster than the original LeNet.
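
The abstract does not describe VOLA's internal layout, so the following is only a minimal sketch of the general idea behind memory-efficient volumetric representations: storing occupancy as one bit per voxel instead of one byte. The helper names (`extrude`, `pack_bits`, `get_voxel`), the toy 4x4 image, and the extrusion step used to turn a 2D digit into a 3D volume are all assumptions made for illustration and are not taken from the paper.

```python
def extrude(image, depth):
    """Copy a 2D binary image (list of rows) into `depth` identical planes.

    Hypothetical stand-in for turning a 2D digit into a 3D occupancy grid.
    """
    return [[row[:] for row in image] for _ in range(depth)]


def pack_bits(volume):
    """Pack a 3D occupancy grid into a bytearray, one bit per voxel.

    Voxels are flattened in (z, y, x) order; bit i of the stream lives in
    byte i // 8 at bit position i % 8.
    """
    flat = [v for plane in volume for row in plane for v in row]
    packed = bytearray((len(flat) + 7) // 8)
    for i, v in enumerate(flat):
        if v:
            packed[i // 8] |= 1 << (i % 8)
    return packed


def get_voxel(packed, shape, z, y, x):
    """Read one voxel back from the packed representation."""
    _, height, width = shape
    i = (z * height + y) * width + x
    return (packed[i // 8] >> (i % 8)) & 1


# Toy 4x4 binary "digit" (assumed example data, not from any dataset).
image = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]
shape = (4, 4, 4)
volume = extrude(image, depth=shape[0])
packed = pack_bits(volume)

dense_bytes = shape[0] * shape[1] * shape[2]  # one byte per voxel
packed_bytes = len(packed)                    # one bit per voxel
```

For this 4x4x4 grid, the byte-per-voxel layout needs 64 bytes while the packed layout needs 8, an 8x reduction from packing alone. Hierarchical or sparse schemes can reduce memory further for mostly empty volumes, but whether and how VOLA does so is not stated in the abstract.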