학술논문

Generating Description for Possible Collisions in Object Placement Tasks / Nearest Neighbor Future Captioning: 物体配置タスクにおける衝突リスクに関する説明文生成
Document Type
Journal Article
Source
Proceedings of the Annual Conference of JSAI. 2023, :3
Subject
DSRs
Nearest Neighbor Future Captioning
V&L
Vision and Language
Domestic service robots (生活支援ロボット)
Language
Japanese
ISSN
2758-7347
Abstract
The practical implementation of domestic service robots (DSRs) that can communicate in natural language is a promising solution for people who need assistance. In particular, the ability to predict potential hazards associated with task execution and to prompt the user for a judgment can enhance both safety and convenience. However, accurate prediction is difficult because information about future events cannot be used. Existing methods represent the grasped object insufficiently because they do not take an image of the grasped object as input. In addition, because they require the preceding image as input, collisions cannot be avoided at the time of prediction. In this study, we propose adding an attention-map visualization module for collision prediction and enhancing the model's representational power with the k-nearest neighbor method. We conduct comparative experiments using standard evaluation metrics for generated text: BLEU4, METEOR, ROUGE-L, and CIDEr-D. Experimental results show that the proposed method outperforms the baseline on all evaluation metrics.
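
The abstract reports BLEU4, METEOR, ROUGE-L, and CIDEr-D scores for the generated descriptions. As a rough illustration of how such captioning metrics are typically computed, the sketch below uses the pycocoevalcap toolkit (the MS COCO caption-evaluation package); the choice of toolkit and the example captions are assumptions for illustration only and are not taken from the paper.

```python
# Minimal sketch: scoring generated captions with standard metrics.
# Assumes the pycocoevalcap package is installed (pip install pycocoevalcap);
# METEOR additionally requires a Java runtime. The captions below are
# illustrative placeholders, not data from the paper.
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

# Reference (ground-truth) and generated captions, keyed by sample id.
gts = {"0": ["placing the bottle here may collide with the vase on the shelf"]}
res = {"0": ["the bottle may hit the vase when it is placed on the shelf"]}

scorers = [
    (Bleu(4), ["BLEU-1", "BLEU-2", "BLEU-3", "BLEU-4"]),
    (Meteor(), "METEOR"),
    (Rouge(), "ROUGE-L"),
    # The paper reports CIDEr-D; the exact CIDEr variant implemented
    # may differ by toolkit version.
    (Cider(), "CIDEr"),
]

for scorer, name in scorers:
    score, _ = scorer.compute_score(gts, res)
    if isinstance(name, list):  # Bleu returns one score per n-gram order
        for n, s in zip(name, score):
            print(f"{n}: {s:.3f}")
    else:
        print(f"{name}: {score:.3f}")
```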
