Academic Article

A Survey of OCR in Arabic Language: Applications, Techniques, and Challenges

Document Type

article

Author

Safiullah Faizullah; Muhammad Sohaib Ayub; Sajid Hussain; Muhammad Asad Khan

Source

Applied Sciences, Vol 13, Iss 7, p 4584 (2023)

Subject

optical character recognition
Arabic OCR
preprocessing
segmentation
classification
postprocessing
Technology
Engineering (General). Civil engineering (General)
TA1-2040
Biology (General)
QH301-705.5
Physics
QC1-999
Chemistry
QD1-999

Language

English

ISSN

2076-3417
32076002

Abstract

Optical character recognition (OCR) is the process of extracting handwritten or printed text from a scanned or printed image and converting it to a machine-readable form for further data processing, such as searching or editing. Automatic text extraction using OCR helps to digitize documents for improved productivity and accessibility and for the preservation of historical documents. This paper provides a survey of the current state-of-the-art applications, techniques, and challenges in Arabic OCR. We present the existing methods for each step of the complete OCR process to identify the best-performing approach for improved results. This paper follows the keyword-search method for reviewing articles related to Arabic OCR, including the backward and forward citations of each article. In addition to state-of-the-art techniques, this paper identifies research gaps and presents future directions for Arabic OCR.

Online Access

Full Text (ProQuest Central); Full Text (Gale Academic Onefile); Open Access (DOAJ); Open Access (EBSCO); Web of Science; JCR Journal Information; Scopus; Find it@PNU
