Academic Paper

Language-Model-based methods for Vietnamese Single-Document Extractive Summarization

Document Type

Conference

Author

Cuong, Vo Le; Thinh, Vu Xuan; Thien, Vu Duy; Hung, Vo Sy; Viet, Nguyen Xuan

Source

2023 14th International Conference on Information and Communication Technology Convergence (ICTC), pp. 967-972, Oct. 2023

Subject

Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Fields, Waves and Electromagnetics
Power, Energy and Industry Applications
Signal Processing and Analysis
Transportation
Benchmark testing
Information and communication technology
Data mining
Task analysis
Convergence
extractive text summarization
pre-trained language model
single-document

Language

ISSN

2162-1241

Abstract

Recently, the amount of text data has increased dramatically, creating a demand for effective summarization. This paper proposes a method that combines English text summarization frameworks with Vietnamese pre-trained language models for Vietnamese single-document extractive summarization. The experiments were conducted with three frameworks, namely BertSumExt, MatchSum, and CoLoExt, and two pre-trained language models, namely PhoBERT and BartPho. Our models are evaluated on two well-known Vietnamese summarization benchmark datasets, Vietnews and Wikilingua, and achieve state-of-the-art results on Vietnews, with maximum ROUGE-1/2/L scores of 57.15/26.23/39.76. The results on Wikilingua also show the effectiveness of our methods.

