Academic Article
MasakhaNEWS: News Topic Classification for African languages
Document Type
Working Paper
Author
Adelani, David Ifeoluwa; Masiak, Marek; Azime, Israel Abebe; Alabi, Jesujoba; Tonja, Atnafu Lambebo; Mwase, Christine; Ogundepo, Odunayo; Dossou, Bonaventure F. P.; Oladipo, Akintunde; Nixdorf, Doreen; Emezue, Chris Chinenye; Al-Azzawi, Sana; Sibanda, Blessing; David, Davis; Ndolela, Lolwethu; Mukiibi, Jonathan; Ajayi, Tunde; Moteu, Tatiana; Odhiambo, Brian; Owodunni, Abraham; Obiefuna, Nnaemeka; Mohamed, Muhidin; Muhammad, Shamsuddeen Hassan; Ababu, Teshome Mulugeta; Salahudeen, Saheed Abdullahi; Yigezu, Mesay Gemeda; Gwadabe, Tajuddeen; Abdulmumin, Idris; Taye, Mahlet; Awoyomi, Oluwabusayo; Shode, Iyanuoluwa; Adelani, Tolulope; Abdulganiyu, Habiba; Omotayo, Abdul-Hakeem; Adeeko, Adetola; Afolabi, Abeeb; Aremu, Anuoluwapo; Samuel, Olanrewaju; Siro, Clemencia; Kimotho, Wangari; Ogbu, Onyekachi; Mbonu, Chinedu; Chukwuneke, Chiamaka; Fanijo, Samuel; Ojo, Jessica; Awosan, Oyinkansola; Kebede, Tadesse; Sakayo, Toadoum Sari; Nyatsine, Pamela; Sidume, Freedmore; Yousuf, Oreen; Oduwole, Mardiyyah; Tshinu, Tshinu; Kimanuka, Ussen; Diko, Thina; Nxakama, Siyanda; Nigusse, Sinodos; Johar, Abdulmejid; Mohamed, Shafie; Hassan, Fuad Mire; Mehamed, Moges Ahmed; Ngabire, Evrard; Jules, Jules; Ssenkungu, Ivan; Stenetorp, Pontus
Source
Subject
Language
Abstract
African languages are severely under-represented in NLP research due to a lack of datasets covering several NLP tasks. While there are individual language-specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named entity recognition and machine translation) have standardized benchmark datasets covering several geographically and typologically diverse African languages. In this paper, we develop MasakhaNEWS -- a new benchmark dataset for news topic classification covering 16 languages widely spoken in Africa. We provide an evaluation of baseline models by training classical machine learning models and fine-tuning several language models. Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning, such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern-exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence-transformer fine-tuning (SetFit and the Cohere Embedding API). Our evaluation in the zero-shot setting shows the potential of prompting ChatGPT for news topic classification in low-resource African languages, achieving an average performance of 70 F1 points without leveraging additional supervision like MAD-X. In the few-shot setting, we show that with as few as 10 examples per label, we achieve more than 90% (i.e. 86.0 F1 points) of the performance of full supervised training (92.6 F1 points) using the PET approach.
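As context for the few-shot results described in the abstract, the following is a minimal sketch of prompt-free sentence-transformer fine-tuning with the SetFit library; the checkpoint name, the two toy labels, and the example headlines are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal few-shot topic-classification sketch with SetFit (prompt-free
# sentence-transformer fine-tuning). Checkpoint, labels, and headlines are
# illustrative assumptions, not the paper's exact setup.
from datasets import Dataset
from setfit import SetFitModel, SetFitTrainer

# Tiny illustrative training set: a few headlines per topic label.
train_dataset = Dataset.from_dict({
    "text": [
        "Central bank raises interest rates amid inflation fears",
        "Local startup secures new round of funding",
        "National team wins qualifier ahead of continental cup",
        "Veteran striker announces retirement from football",
    ],
    "label": [0, 0, 1, 1],  # 0 = business, 1 = sports (illustrative)
})

# A multilingual sentence-transformer backbone (an assumed choice).
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"
)

trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    num_iterations=20,  # contrastive text pairs generated per example
)
trainer.train()

# Predict topic labels for unseen headlines.
print(model.predict(["Stock exchange reports record trading volume"]))
```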
Comment: Accepted to IJCNLP-AACL 2023 (main conference)