Academic Article

A Text Classification Methodology to Assist a Large Technical Support System
Document Type
Periodical
Source
IEEE Access. 10:108413-108421, 2022
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Natural language processing
Text classification
Information analysis
Artificial intelligence
Feature extraction
Text categorization
Oral communication
Technical management
text classification
technical support
automated assistant
ISSN
2169-3536
Abstract
Text-based tools for reporting technical issues and receiving support are widespread in commercial applications such as customer service and internal corporate communication. Past issues recorded in such systems may provide valuable knowledge for handling future interactions more effectively. Nevertheless, the predominance of short messages and the presence of domain-specific subjects pose additional challenges. In this work, we aim to build an assistant for a system that provides asynchronous technical support services in a large company. It is known that some repetitive technical issues can be handled with simple standard messages, called templates. Thus, we propose a modular pipeline based on natural language processing and machine learning algorithms that performs raw text processing, feature extraction, and supervised learning to recommend suitable templates from a textual description of the incoming issue. In a real-world scenario, the proposed pipeline achieved an average accuracy of 72.7%, a promising result for a setup with 9 classes and few labeled training instances. Moreover, a post hoc analysis shows that our methodology correctly identifies the words most closely related to the corresponding templates.
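The three stages named in the abstract (raw text processing, feature extraction, supervised learning) can be illustrated with a minimal sketch. The abstract does not say which algorithms the authors used, so everything here is an assumption made for illustration: the bag-of-words features, the multinomial Naive Bayes classifier with add-one smoothing, the class name `TemplateRecommender`, the three template labels, and the toy messages are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Raw text processing: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class TemplateRecommender:
    """Hypothetical stand-in classifier: bag-of-words multinomial Naive Bayes.

    This is NOT the paper's model; it only shows the shape of the pipeline.
    """

    def fit(self, messages, templates):
        # Feature extraction: per-template word frequencies.
        self.word_counts = defaultdict(Counter)
        # Number of training messages per template (for the class prior).
        self.class_counts = Counter(templates)
        for message, template in zip(messages, templates):
            self.word_counts[template].update(tokenize(message))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def log_scores(self, message):
        # Words never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(message) if t in self.vocabulary]
        total_messages = sum(self.class_counts.values())
        scores = {}
        for template, n_messages in self.class_counts.items():
            counts = self.word_counts[template]
            # Add-one (Laplace) smoothing over the whole vocabulary.
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_messages / total_messages)
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            scores[template] = score
        return scores

    def recommend(self, message):
        """Supervised prediction: return the highest-scoring template."""
        scores = self.log_scores(message)
        return max(scores, key=scores.get)


# Toy training data (invented for this sketch, not from the paper).
messages = [
    "I forgot my password and cannot log in",
    "please reset my password",
    "the printer is not printing",
    "printer shows a paper jam error",
    "vpn connection keeps dropping",
    "cannot connect to the vpn from home",
]
templates = [
    "password_reset", "password_reset",
    "printer_help", "printer_help",
    "vpn_help", "vpn_help",
]

model = TemplateRecommender().fit(messages, templates)
print(model.recommend("my password does not work"))
```

A real deployment along the lines of the abstract would need far more labeled messages per template and a held-out evaluation; the 72.7% accuracy reported above refers to the authors' pipeline, not to this sketch.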