Academic Paper

Multimodal Attention Branch Networkに基づく把持命令文の生成 / Sentence Generation for Fetching Instruction based on Multimodal Attention Branch Network

Document Type

Journal Article

Author

Aly MAGASSOUBA; Hironobu FUJIYOSHI; Hisashi KAWAI; Komei SUGIURA; Tadashi OGURA; Takayoshi YAMASHITA; Tsubasa HIRAKAWA; 小椋忠志; 山下隆義; 平川翼; 杉浦孔明; 河井恒; 藤吉弘亘

Source

Proceedings of the Annual Conference of JSAI. 2020, :1

Subject

Domestic service robot
Multimodal language generation

Language

Japanese

Abstract

Domestic service robots (DSRs) are a promising solution to the shortage of home care workers. Nonetheless, one of the main limitations of DSRs is their inability to interact naturally through language. Recently, data-driven approaches have been shown to be effective for tackling this limitation; however, they often require large-scale datasets, which are costly to build. Against this background, we aim to perform automatic sentence generation for fetching instructions, e.g., "Bring me a green tea bottle on the table." This is particularly challenging because appropriate expressions depend on the target object as well as its surroundings. In this paper, we propose a method that generates sentences from visual inputs. Unlike other approaches, the proposed method has multimodal attention branches that utilize subword-level attention and generate sentences based on subword embeddings. In the experiment, we compared the proposed method with a baseline method using four standard image captioning metrics. Experimental results show that the proposed method outperformed the baseline on these metrics.
