Academic Paper

Cross-Modal Recipe Embeddings by Disentangling Recipe Contents and Dish Styles

Document Type

Conference

Author

Sugiyama, Yu; Yanai, Keiji

Source

Proceedings of the 29th ACM International Conference on Multimedia, pp. 2501-2509

Subject

GAN
cross-modal search
feature disentanglement
recipe embedding

Language

English

Abstract

Recipe-sharing websites are now widely used and play a major role in everyday home cooking. Because cooking recipes consist of dish photos and recipe texts, cross-modal recipe search is being actively explored. To enable cross-modal search, food image features and recipe text features are generally embedded into a shared space. However, most existing studies assume a one-to-one correspondence between a recipe text and a dish image in the embedding space, even though an unlimited number of photos with different serving styles and different plates can be associated with the same recipe. In this paper, we propose RDE-GAN (Recipe Disentangled Embedding GAN), which separates food image information into a recipe image feature and a non-recipe shape feature. In addition, we generate a food image by integrating the recipe embedding and a shape feature. Because the proposed embedding is free from serving and plate styles, which are unrelated to cooking recipes, it outperformed existing methods on cross-modal recipe search in our experiments. We also confirmed that the shape elements and the recipe elements can each be changed independently at the time of food image generation.
