Academic Paper

Toward Automatic Recognition of Cursive Chinese Calligraphy: An Open Dataset For Cursive Chinese Calligraphy Text
Document Type
Conference
Source
2020 14th International Conference on Ubiquitous Information Management and Communication (IMCOM), pp. 1-5, Jan. 2020
Subject
Communication, Networking and Broadcast Technologies
Robotics and Control Systems
Training
Neural networks
Text recognition
Databases
Writing
Task analysis
Machine learning
Cursive Chinese Calligraphy
Text Recognition
Deep Learning
Language
Abstract
Calligraphy was one of the most important writing tools in ancient China and is part of its cultural heritage. Compared with other calligraphy styles, the cursive script is the least restricted and often expresses the personality of the calligrapher. However, this style-oriented expression makes the cursive script hard to recognize, even for trained experts, which has created a call for auxiliary tools for cursive Chinese calligraphy text recognition. Data play a key role in the era of deep learning, yet there is a lack of open databases for cursive Chinese calligraphy. In this paper, we address this gap by collecting 43,000 images covering 5,301 different cursive Chinese calligraphy texts. We augmented the database with basic image processing operations to obtain a training set containing a total of 656K images. After experimenting with several deep neural architectures, we provide a baseline model, Enhanced M6 (EM6), as a proof of concept for the classification task. The proposed EM6 model achieved 60.3% top-1 accuracy and 80.8% top-5 accuracy on the evaluation data set, an indication that deep neural networks have the potential to handle cursive calligraphy recognition.
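For readers unfamiliar with the top-1 and top-5 figures quoted in the abstract, the following is a minimal sketch of how top-k accuracy is commonly computed for a classifier. It is not taken from the paper: the function name `top_k_accuracy`, the toy scores, and the toy labels are all hypothetical and serve only to illustrate the metric.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy data (made up for illustration): 3 samples, 4 classes.
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: hit at top-1
    [0.4, 0.3, 0.2, 0.1],  # true class 1: missed at top-1, hit at top-2
    [0.5, 0.1, 0.3, 0.1],  # true class 3: missed at top-1 and top-2
]
toy_labels = [1, 1, 3]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

Under this definition, top-5 accuracy can never be lower than top-1 accuracy, which is consistent with the gap between the 60.3% and 80.8% figures reported above.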