Academic Paper

Overview of the Mowjaz Multi-Topic Labelling Task

Document Type

Conference

Author

Al-Ayyoub, Mahmoud; Seelawi, Haitham; Zaghlol, Mohamed; Al-Natsheh, Hussein T.; Suileman, Samer; Fadel, Ali; Badawi, Riham; Morsy, Ahmed; Tuffaha, Ibraheem; Aljarrah, Mohannad

Source

2021 12th International Conference on Information and Communication Systems (ICICS), pp. 502-508, May 2021

Subject

Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Support vector machines
Recurrent neural networks
Atmospheric measurements
Error analysis
Text categorization
Particle measurements
Natural language processing
Multi-label Text Classification
SVM
RNN
LSTM
GRU
AraVec
Arabic BERT
AraBERT
GigaBERT

Language

ISSN

2573-3346

Abstract

Multi-label text classification is an important task in Natural Language Processing (NLP). One use case is categorizing news articles, where each article may belong to one or more classes. In this work, we present the ICICS2021 Mowjaz Multi-Topic Labelling Task. Given a news article, systems participating in this task are expected to select its topic(s). The systems are evaluated using the F1 score. In total, 46 teams registered on the task’s CodaLab page, and 28 of them submitted a combined 309 runs. The results are surprisingly high. Moreover, they are very close to each other, with all teams having systems that achieve F1 scores between 0.7965 and 0.8567. Most of these systems used deep learning models, such as Recurrent Neural Networks (RNNs), coupled with pretrained word embeddings, such as those from BERT-based models. Only a few of them experimented with traditional machine learning models such as Support Vector Machines (SVM) and Naive Bayes (NB).
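
To illustrate how an F1 score can be computed when each article carries a set of topics, here is a minimal sketch in Python. The abstract does not say which averaging scheme the task used, so micro-averaging is an assumption here. The function name `micro_f1` and the topic names are hypothetical and do not come from the task data.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F1 over per-article label sets.

    Assumption: micro-averaging. The abstract only states that F1 was used,
    not how it was averaged across topics.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)  # topics predicted and correct
        fp += len(p - g)  # topics predicted but not in the gold set
        fn += len(g - p)  # gold topics the system missed

    denom = 2 * tp + fp + fn
    # With no gold and no predicted labels at all, return 0.0 by convention.
    return 2 * tp / denom if denom else 0.0


# Hypothetical topic labels, for illustration only.
gold = [{"politics", "economy"}, {"sports"}, {"technology", "economy"}]
pred = [{"politics"}, {"sports", "health"}, {"technology", "economy"}]

print(round(micro_f1(gold, pred), 4))  # 4 TP, 1 FP, 1 FN -> 0.8
```

Micro-averaging pools true positives, false positives, and false negatives over all articles and topics before computing F1, so frequent topics weigh more than rare ones. A macro-averaged variant would compute F1 per topic and then average those values.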

