Academic Paper

Lossless Compression of Textual Images: A Study on Indic Script Documents
Document Type
Conference
Source
18th International Conference on Pattern Recognition (ICPR'06). 3:806-809, 2006
Subject
Signal Processing and Analysis
Computing and Processing
Image coding
Image reconstruction
Pattern matching
Dictionaries
Decoding
Optical character recognition software
Image storage
Prototypes
Arithmetic
Libraries
Language
ISSN
1051-4651
Abstract
This paper presents a method for lossless compression of Indian-language textual images. The study extends a previously developed pattern matching and substitution (PM&S)-based method for lossy compression of similar images. An efficient method for residue coding is proposed, and its performance is compared with CCITT Gr-IV and JBIG. A set of 20 text images in the two most popular Indic scripts, namely Devanagari (Hindi) and Bengali, is used in the experiment. The best results are achieved by the PM&S-based approach followed by LZW-based residue coding. This combined scheme gives a lossless compression ratio of about 37.9.
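
The idea of adding a coded residue to a lossy reconstruction can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the residue is formed or how LZW is configured, so the sketch assumes the residue is the bytewise XOR between the original bitmap and its lossy approximation, and it uses a textbook LZW with an unbounded dictionary. The names xor_residue, lzw_encode, and lzw_decode are hypothetical, and the PM&S stage itself is not modeled.

```python
from typing import List


def xor_residue(image: bytes, approximation: bytes) -> bytes:
    """Bytewise XOR of two equal-length bitmaps (assumed residue definition)."""
    if len(image) != len(approximation):
        raise ValueError("bitmaps must have the same length")
    return bytes(a ^ b for a, b in zip(image, approximation))


def lzw_encode(data: bytes) -> List[int]:
    """Textbook LZW: emit one dictionary code per longest known prefix."""
    table = {bytes([i]): i for i in range(256)}
    codes: List[int] = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new phrase
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decode(codes: List[int]) -> bytes:
    """Inverse of lzw_encode, rebuilding the dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the phrase being defined in this very step.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[len(table)] = previous + entry[:1]
        previous = entry
    return bytes(out)


# Toy data: a "lossy" approximation that differs from the original in two bytes.
original = bytes([0b11110000] * 32 + [0b00001111] * 32)
approx = bytearray(original)
approx[5] ^= 0b00010000
approx[40] ^= 0b00000100
approx = bytes(approx)

residue = xor_residue(original, approx)      # mostly zeros
codes = lzw_encode(residue)                  # what would be stored
restored = xor_residue(approx, lzw_decode(codes))
```

Because XOR is its own inverse, applying the decoded residue to the approximation recovers the original exactly, which is what makes the combined scheme lossless. A residue that is mostly zero produces far fewer LZW codes than it has bytes, which is the intuition behind pairing a good lossy approximation with dictionary-based residue coding.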