Academic Paper

Synthetic handwritten Odia numeral database: From shallow hundreds to comprehensive thousands
Document Type
Conference
Source
2015 Fifth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), pp. 1-4, Dec. 2015
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Signal Processing and Analysis
Databases
Handwriting recognition
Training
Shape
Optical character recognition software
Robustness
Synthetic database
interclass deformation
OCR
handwritten numeral synthesis
evaluation
Abstract
A comprehensive database that contains all possible variations of handwriting is crucial for training and recognition. The primary challenge for an optical character recognizer (OCR) is that many characters from different classes bear a structural resemblance to one another, whereas images within a single class can differ considerably. Acquiring a database large enough to ensure robust training of the recognizer is a painstaking task. Recent research has therefore focused on creating, from a few handwriting samples, a comprehensive synthetic database that not only preserves naturalness but also provides much-needed pattern variability. In this paper, we propose a new approach to synthetic handwritten numeral generation for the Odia language using interclass deformation. We experimentally evaluate the generated databases using state-of-the-art recognition systems. The recognition results are compared on two benchmark databases (the ISI Kolkata and IIT Bhubaneswar Odia numeral databases) as well as on two newly created synthetic databases. Each Odia numeral database is enlarged 20-fold using our proposed approach. The nonlinear pattern variance introduced by interclass deformation is shown to pose a greater challenge to conventional recognizers. We also experiment with a mixture of original and synthetic databases for training the OCR to achieve robustness and higher accuracy.