Academic Paper

PUM at SemEval-2020 Task 12: Aggregation of Transformer-based models' features for offensive language recognition

Document Type

Working Paper

Author

Janiszewski, Piotr; Skiba, Mateusz; Walińska, Urszula

Source

Subject

Computer Science - Computation and Language

Language

Abstract

In this paper, we describe the PUM team's entry to SemEval-2020 Task 12. Our solution builds on two well-known pretrained models used in natural language processing, BERT and XLNet, which achieve state-of-the-art results on multiple NLP tasks. The models were fine-tuned separately for each sub-task, and features taken from their hidden layers were combined and fed into a fully connected neural network. A model using aggregated Transformer features can serve as a powerful tool for the offensive language identification problem. Our team ranked 7th out of 40 in Sub-task C (offense target identification) with a macro F1-score of 64.727%, and 64th out of 85 in Sub-task A (offensive language identification) with an F1-score of 89.726%.
Comment: 7 pages, 0 figures. Proceedings of the International Workshop on Semantic Evaluation (SemEval-2020)
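
The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that hidden-layer features were "combined", so concatenation is assumed here, and the feature vectors are random stand-ins rather than real BERT or XLNet outputs. All names, dimensions, and the three-class output are hypothetical choices made for illustration.

```python
import math
import random


def concat_features(bert_vec, xlnet_vec):
    """Combine two feature vectors by concatenation (assumed aggregation)."""
    return list(bert_vec) + list(xlnet_vec)


def dense(x, weights, bias):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    """Numerically stable softmax over a list of scores."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases; a real model would learn these."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(bert_vec, xlnet_vec, hidden, output):
    """Concatenate both feature vectors, then apply a two-layer classifier."""
    x = concat_features(bert_vec, xlnet_vec)
    h = relu(dense(x, *hidden))
    return softmax(dense(h, *output))


# Hypothetical sizes; real hidden-layer features are far larger.
BERT_DIM, XLNET_DIM, HIDDEN_DIM, NUM_CLASSES = 4, 4, 8, 3

rng = random.Random(0)
bert_vec = [rng.gauss(0, 1) for _ in range(BERT_DIM)]    # stand-in for BERT features
xlnet_vec = [rng.gauss(0, 1) for _ in range(XLNET_DIM)]  # stand-in for XLNet features
hidden = init_layer(rng, HIDDEN_DIM, BERT_DIM + XLNET_DIM)
output = init_layer(rng, NUM_CLASSES, HIDDEN_DIM)

probs = classify(bert_vec, xlnet_vec, hidden, output)
print(probs)  # one probability per class
```

In practice the two input vectors would come from the fine-tuned models' hidden layers, and the classifier weights would be trained on the labelled task data rather than left at their random initial values.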

Online Access

Open Access (arXiv)