Journal article

Development of Character Recognition Model Inspired by Visual Explanations
Document Type
Periodical
Source
IEEE Transactions on Artificial Intelligence, 5(3):1362-1372, Mar. 2024
Subject
Computing and Processing
Character recognition
Visualization
Brain modeling
Artificial intelligence
Computational modeling
Task analysis
Gaze tracking
cognitive processes
explainable architecture
eye-tracking
Language
English
ISSN
2691-4581
Abstract
Deep neural networks (DNNs) currently constitute the best-performing artificial vision systems. However, humans still outperform even highly sophisticated recognition models on many characters, especially distorted, ornamental, or calligraphic ones. Understanding the mechanism of character recognition by humans may offer cues for building better recognition models, yet the appropriate methodology for using these cues in developing character recognition models has been little explored. Therefore, this paper seeks to understand the process of character recognition by humans and DNNs by generating visual explanations for their respective decisions. We have used eye tracking to assay the spatial distribution of information hotspots for humans via fixation maps. We have proposed a gradient-based method for visualizing the reasoning behind the model's decision through visualization maps and have shown that our method outperforms other class activation mapping methods. Qualitative comparison between visualization maps and fixation maps reveals that the model and humans focus on similar regions of a character when it is correctly classified. However, when the regions attended to by humans and the model differ, the model typically misclassifies the character. Hence, we propose to use the fixation maps as a supervisory input for training the model, which ultimately results in improved recognition performance and better generalization. As the proposed model offers insight into the reasoning behind its decisions, it can find applications in fields such as surveillance and medicine, where explainability helps to determine system fidelity.
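
The abstract refers to a gradient-based method for producing visualization maps of the model's decision. The paper's exact formulation is not reproduced here; the following is a minimal Grad-CAM-style sketch in PyTorch, assuming a CNN character classifier with an accessible convolutional feature layer. The function name, `feature_layer` argument, and image shapes are hypothetical.

```python
# Minimal Grad-CAM-style sketch of a gradient-based visualization map.
# Illustrative only; not the paper's exact method.
import torch
import torch.nn.functional as F

def visualization_map(model, image, target_class, feature_layer):
    """Return a class-discriminative heat map for one character image (C, H, W)."""
    feats, grads = [], []
    h1 = feature_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = feature_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))

    logits = model(image.unsqueeze(0))           # forward pass on a single image
    model.zero_grad()
    logits[0, target_class].backward()           # gradient of the target class score

    h1.remove(); h2.remove()
    fmap, grad = feats[0][0], grads[0][0]        # feature maps and their gradients, (K, h, w)
    weights = grad.mean(dim=(1, 2))              # global-average-pooled gradients per channel
    cam = F.relu((weights[:, None, None] * fmap).sum(dim=0))
    cam = cam / (cam.max() + 1e-8)               # normalize to [0, 1]
    return F.interpolate(cam[None, None], size=image.shape[-2:],
                         mode="bilinear", align_corners=False)[0, 0]
```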
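
The abstract also describes using human fixation maps as a supervisory input during training. One plausible way to realize this, sketched below under the assumption that fixation maps are resized to match the model's visualization maps, is to add an alignment term to the classification loss. The MSE term and the trade-off weight `lam` are assumptions, not the paper's reported loss.

```python
# Hedged sketch: classification loss plus a fixation-map alignment term.
import torch
import torch.nn.functional as F

def supervised_attention_loss(logits, labels, model_maps, fixation_maps, lam=0.5):
    """logits: (B, num_classes); model_maps and fixation_maps: (B, H, W) in [0, 1];
    lam is a hypothetical trade-off weight."""
    ce = F.cross_entropy(logits, labels)             # standard classification loss
    align = F.mse_loss(model_maps, fixation_maps)    # pull model attention toward human fixations
    return ce + lam * align
```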