Academic Paper

Abstractive Multi-Document Summarization Using Sentence Fusion
Document Type
Conference
Source
2023 International Conference on Information Technology (ICIT), pp. 734-741, Aug 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Deep learning
Fuses
Focusing
Benchmark testing
Transformers
Natural language processing
Data mining
Abstractive Summarization
Deep Learning
Text-To-Text transformer
Sentence Fusion
Kullback-Leibler Divergence
ISSN
2831-3399
Abstract
Multi-document abstractive text summarization aims to generate a gist that focuses on the salient concepts in a set of related documents. Abstracting is a distinctly human ability that is difficult to model, which makes automatic abstractive summarization one of the most difficult tasks in natural language processing (NLP). Over the past few years, with the advancement of deep learning in the NLP domain, abstractive text summarization has made significant progress using text-to-text generative models. One drawback of such models is their input-length restriction, which limits the amount of text that can be processed at once and poses challenges when summarizing multiple documents as a whole. To address this issue, we present in this paper an approach that uses an unsupervised extractive summarization model to create an extract, which is passed to a deep learning-based sentence fusion model that fuses the extracted sentences. Finally, a KL divergence-based method selects a subset of the fused sentences to create a summary that maximizes the information content. We trained the proposed model on the Multi-News benchmark multi-document text summarization dataset and evaluated it on the DUC 2004 dataset. Experimental results show that the proposed approach outperforms state-of-the-art systems.
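The KL divergence-based selection step described in the abstract can be sketched as a greedy selector in the spirit of KLSum: repeatedly add the fused sentence whose inclusion makes the summary's unigram distribution closest (in KL divergence) to that of the source documents. The paper's exact formulation is not given in the abstract, so the function names, the KL direction, and the smoothing constant below are illustrative assumptions, not the authors' implementation:

```python
from collections import Counter
import math

def word_dist(texts):
    """Unigram word distribution over a list of texts."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q); words absent from q get a small smoothed mass."""
    return sum(pw * math.log(pw / q.get(w, eps)) for w, pw in p.items())

def kl_select(candidates, source_docs, max_sentences=3):
    """Greedily pick fused sentences whose combined distribution
    best matches the source distribution (minimum KL divergence)."""
    source = word_dist(source_docs)
    summary, remaining = [], list(candidates)
    while remaining and len(summary) < max_sentences:
        best = min(remaining,
                   key=lambda s: kl_divergence(word_dist(summary + [s]), source))
        # Stop early if adding the best candidate no longer improves the fit.
        if summary and (kl_divergence(word_dist(summary + [best]), source)
                        >= kl_divergence(word_dist(summary), source)):
            break
        summary.append(best)
        remaining.remove(best)
    return summary
```

A candidate sentence made of words that never occur in the source documents incurs a large penalty from the smoothing term, so off-topic fusions are naturally filtered out.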