Academic Paper

A Proposal of Response Continuity Prediction Model for Attentive Listening Agents / 傾聴対話システムのための返答継続性の予測モデルの提案
Document Type
Journal Article
Source
人工知能学会論文誌 / Transactions of the Japanese Society for Artificial Intelligence. 2023, 38(4):C-1
Subject
attentive listening agents
domain knowledge
multimodality
naturally occurring data
response continuity
Language
Japanese
ISSN
1346-0714
1346-8030
Abstract
Current attentive listening agents improve system performance mainly by designing the format of their output utterances. However, to achieve better attentive listening performance, it is also important to predict when to start and when to end attentive listening. Predicting this timing can also be seen as predicting whether or not the speaker will continue to speak in an occupied multi-unit turn. In this paper, we propose a deep learning model that predicts whether or not a speaker will continue speaking in an attentive listening dialog. We focus on the situation in which the respondent either continues responding to a question from the interlocutor or completes the response being produced. Our model has three features. First, its input data is designed using domain knowledge about the structure of attentive listening dialog and the characteristics of the vocabulary used in it. Second, the model is built on multimodal information such as text, acoustic features, and characteristic tokens. Third, for practicality in actual daily dialogs, the model is built from everyday conversation data collected from a large-scale corpus. The experimental results show that the proposed model achieves the best prediction performance among the models we examined, indicating strong potential for predicting the timing of attentive listening in dialog agents.