Academic Paper
Synthetic handwritten Odia numeral database: From shallow hundreds to comprehensive thousands
Document Type
Conference
Source
2015 Fifth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), pp. 1-4, Dec. 2015
Abstract
A comprehensive database that contains all possible variations of handwriting is crucial for training and recognition. The primary challenge for an optical character recognizer (OCR) is that a number of characters from different classes bear structural resemblance, whereas images within a class can differ considerably. Acquiring a database large enough to ensure robust training of the recognizer is a painstaking task. Recent research has therefore sought to create, from a few handwriting samples, a comprehensive synthetic database that not only preserves naturalness but also provides much-needed pattern variability. In this paper, we propose a new approach to synthetic handwritten numeral generation for the Odia language using interclass deformation. We experimentally evaluate the generated databases using state-of-the-art recognition systems. The recognition results are compared on two benchmark databases (ISI Kolkata and IIT Bhubaneswar Odia numeral) as well as two newly created synthetic databases. Each Odia numeral database is increased 20-fold in size using our proposed approach. The nonlinear pattern variance introduced by interclass deformation is shown to pose a greater challenge to conventional recognizers. We also experiment with a mixture of original and synthetic databases for training the OCR to achieve robustness and higher accuracy.