Academic Paper

Evaluating Transformer-based Models in the Information Extraction of Fiscal Documents
Document Type
Conference
Source
2023 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1-6, Oct. 2023
Subject
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Computational modeling
Information retrieval
Transformers
Data models
Complexity theory
Data mining
Optimization
Key Information Extraction
Visually Rich Documents
Fiscal Documents
Document Understanding
Language
ISSN
2769-7622
Abstract
The use of artificial intelligence to extract information from documents is essential for process automation and information mining. In this context, fiscal documents are an essential target due to their high volume, variety, complexity, and inconsistent document structure. In this work, we perform a comparative analysis of three state-of-the-art key information extraction models. We assess the performance of the LAMBERT, LayoutLM, and LayoutLMv2 models considering different metrics. The models were evaluated on the CORD, Brazilian Invoice, and Brazilian Receipt datasets. LayoutLMv2 showed the best performance but also the slowest inference time and the largest model size. LayoutLM performed slightly better than LAMBERT, but in most scenarios the two are interchangeable in terms of performance.
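To illustrate the kind of key information extraction task the paper evaluates, the following is a minimal sketch of applying one of the compared models, LayoutLM, to token classification with the HuggingFace transformers library. The label set, OCR words, and bounding boxes below are illustrative assumptions, not the paper's actual configuration or data.

# Minimal sketch (assumed setup, not the authors' code): LayoutLM for
# token-level key information extraction over OCR words with bounding boxes.
import torch
from transformers import LayoutLMTokenizer, LayoutLMForTokenClassification

labels = ["O", "B-TOTAL", "B-DATE"]  # hypothetical label set for a receipt
tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=len(labels)
)

words = ["Total", "42.00"]                              # example OCR words
boxes = [[300, 600, 360, 620], [400, 600, 460, 620]]    # boxes normalized to 0-1000

# Tokenize word by word so every sub-token keeps its word's bounding box.
token_ids, token_boxes = [tokenizer.cls_token_id], [[0, 0, 0, 0]]
for word, box in zip(words, boxes):
    ids = tokenizer.encode(word, add_special_tokens=False)
    token_ids.extend(ids)
    token_boxes.extend([box] * len(ids))
token_ids.append(tokenizer.sep_token_id)
token_boxes.append([1000, 1000, 1000, 1000])

input_ids = torch.tensor([token_ids])
bbox = torch.tensor([token_boxes])
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids=input_ids, bbox=bbox, attention_mask=attention_mask).logits
predictions = logits.argmax(-1).squeeze(0).tolist()
print([labels[p] for p in predictions])

In an evaluation setting like the one described in the abstract, predictions of this form would be compared against gold annotations using metrics such as entity-level F1, alongside inference time and model size.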