Academic Paper

Weighted Token-Level Virtual Adversarial Training in Text Classification
Document Type
Conference
Source
2022 3rd International Conference on Pattern Recognition and Machine Learning (PRML), pp. 117-123, Jul. 2022
Subject
Computing and Processing
Signal Processing and Analysis
Training
Text recognition
Perturbation methods
Bit error rate
Text categorization
Training data
Benchmark testing
natural language processing
virtual adversarial training
transformer
text classification
Language
English
Abstract
Text classification is the process of assigning text to categories. Among the contextualized pretrained model architectures proposed, Bidirectional Encoder Representations from Transformers (BERT) lets models learn bidirectional contexts of words, making text classification considerably more efficient and accurate. Although BERT and its variants have led to impressive gains on many natural language processing (NLP) tasks, one of BERT's problems is overfitting. When training data is limited, the BERT model overemphasizes certain words and ignores the overall context of the sentence, making it difficult for the model to predict accurately on test data. In this paper, we propose weighted token-level virtual adversarial training, which combines two levels of perturbation: (1) a sentence-level perturbation and (2) a weighted token-level perturbation, yielding a more granular perturbation than traditional virtual adversarial training, which uses only sentence-level perturbation. Our approach helps models focus on the key and important tokens in a sentence when trained with virtual adversarial examples. Experiments on the General Language Understanding Evaluation (GLUE) benchmark show that our approach achieves an average score of 79.5%, outperforming the BERT base model. Our approach also mitigates overfitting, especially when datasets are small.
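The abstract does not give implementation details, but the general idea can be sketched in PyTorch. The snippet below is a minimal, hypothetical illustration of virtual adversarial training with token-level weighting of the perturbation; the gradient-norm weighting scheme, the hyperparameters `epsilon` and `xi`, and the model interface are assumptions for illustration only, not the authors' exact formulation. It assumes a classifier (e.g., a BERT-style encoder) that accepts precomputed token embeddings via `inputs_embeds` and returns logits.

```python
# Minimal sketch of weighted token-level virtual adversarial training (VAT).
# Assumptions (not from the paper): the model takes `inputs_embeds` and
# `attention_mask` and returns logits; token weights come from per-token
# gradient norms; epsilon and xi are illustrative hyperparameters.
import torch
import torch.nn.functional as F


def weighted_token_vat_loss(model, inputs_embeds, attention_mask,
                            epsilon=1.0, xi=1e-6):
    """KL(p_clean || p_perturbed) where the perturbation combines a
    sentence-level adversarial direction with per-token weights."""
    # Clean predictions serve as the target distribution (no gradient needed).
    with torch.no_grad():
        clean_probs = F.softmax(
            model(inputs_embeds=inputs_embeds,
                  attention_mask=attention_mask), dim=-1)

    # Sentence-level direction: random noise refined by one power-iteration
    # step, as in standard virtual adversarial training.
    d = torch.randn_like(inputs_embeds)
    d = F.normalize(d.view(d.size(0), -1), dim=-1).view_as(d)
    d.requires_grad_(True)
    adv_logits = model(inputs_embeds=inputs_embeds + xi * d,
                       attention_mask=attention_mask)
    kl = F.kl_div(F.log_softmax(adv_logits, dim=-1),
                  clean_probs, reduction="batchmean")
    (grad,) = torch.autograd.grad(kl, d)
    d = F.normalize(grad.view(grad.size(0), -1), dim=-1).view_as(grad).detach()

    # Hypothetical token weights: per-token gradient norms, normalized over
    # the sequence so important tokens get more of the perturbation budget.
    token_norms = grad.detach().norm(dim=-1) * attention_mask.float()
    weights = token_norms / (token_norms.sum(dim=-1, keepdim=True) + 1e-12)

    # Final perturbation: sentence-level direction rescaled token by token.
    r_adv = epsilon * weights.unsqueeze(-1) * d
    adv_logits = model(inputs_embeds=inputs_embeds + r_adv,
                       attention_mask=attention_mask)
    return F.kl_div(F.log_softmax(adv_logits, dim=-1),
                    clean_probs, reduction="batchmean")
```

In training, a regularization term of this kind would typically be added to the standard cross-entropy loss on labeled examples with a weighting coefficient.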