Academic Paper

A Mining approach for Automatic Processing of Regulatory Document
Document Type
Conference
Source
2022 IEEE Biennial Congress of Argentina (ARGENCON), pp. 1-8, Sep. 2022
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
Geoscience
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Databases
Natural languages
Clustering algorithms
Tagging
Chatbots
Proposals
Document Segmentation
Data Mining
Natural Language Processing
Coherence Detection
Language
Abstract
This paper presents a proposal for automatically detecting the structure of regulatory documents, managing tags, and segmenting text into units that can be processed as database entries. The information stored this way must be suitable for efficient use by a chatterbot named PTAH, which aims to answer user questions in natural language. The paper explains the usage, main difficulties, and characteristics of managing information in this context. It then gives a step-by-step derivation using the proposed method on a small test set of 27 documents. Finally, the results are analyzed and evaluated. The proposal is flexible and robust, and is based on simple processing, clustering, and a rule-based automatic algorithm.