학술논문
A Survey of OCR in Arabic Language: Applications, Techniques, and Challenges
Document Type
article
Source
Applied Sciences, Vol 13, Iss 7, p 4584 (2023)
Subject
Language
English
ISSN
2076-3417
32076002
32076002
Abstract
Optical character recognition (OCR) is the process of extracting handwritten or printed text from a scanned or printed image and converting it to a machine-readable form for further data processing, such as searching or editing. Automatic text extraction using OCR helps to digitize documents for improved productivity and accessibility and for preservation of historical documents. This paper provides a survey of the current state-of-the-art applications, techniques, and challenges in Arabic OCR. We present the existing methods for each step of the complete OCR process to identify the best-performing approach for improved results. This paper follows the keyword-search method for reviewing the articles related to Arabic OCR, including the backward and forward citations of the article. In addition to state-of-art techniques, this paper identifies research gaps and presents future directions for Arabic OCR.