Academic Paper
Language-Model-based methods for Vietnamese Single-Document Extractive Summarization
Document Type
Conference
Source
2023 14th International Conference on Information and Communication Technology Convergence (ICTC), pp. 967-972, Oct 2023
Subject
Language
ISSN
2162-1241
Abstract
Recently, the amount of text data has increased dramatically, creating demand for effective summarization. This paper proposes a method that combines English text summarization frameworks with Vietnamese pre-trained language models for Vietnamese single-document extractive summarization. Experiments were conducted with three frameworks, namely BertSumExt, MatchSum, and CoLoExt, and two pre-trained language models, namely PhoBERT and BartPho. Our models are evaluated on two well-known Vietnamese summarization benchmark datasets, Vietnews and Wikilingua, and achieve state-of-the-art results on Vietnews, with maximum ROUGE-1/2/L scores of 57.15/26.23/39.76. The results on Wikilingua also show the effectiveness of our methods.