Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2020
Abstract
Recent work in cross-modal image-to-recipe retrieval paves a new way to scale up food recognition. By learning a joint space between food images and recipes, food recognition is reduced to a retrieval problem that evaluates the similarity of embedded features. The major drawback, nevertheless, is the difficulty of applying an already-trained model to recognize dishes from different cuisines unknown to the model. In general, model updating with new training examples, in the form of image-recipe pairs, is required to adapt a model to the new cooking styles of a cuisine. In practice, however, acquiring a sufficient number of image-recipe pairs for model transfer can be time-consuming. This paper addresses the challenge of resource scarcity in the scenario where only partial data, instead of a complete view of the data, is accessible for model transfer. Partial data refers to missing information, such as the absence of the image modality or the cooking instructions from an image-recipe pair. To cope with partial data, a novel generic model, equipped with various loss functions including cross-modal metric learning, recipe residual loss, semantic regularization and adversarial learning, is proposed for cross-domain transfer learning. Experiments are conducted on three different cuisines (Chuan, Yue and Washoku) to provide insights on scaling up food recognition across domains with limited training resources.
Keywords
cross-domain transfer, cross-modal food retrieval, food recognition
Discipline
Databases and Information Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 28th ACM International Conference on Multimedia, MM 2020, Seattle, October 12–16
First Page
3762
Last Page
3770
ISBN
9781450379885
Identifier
10.1145/3394171.3413809
Publisher
Association for Computing Machinery, Inc
City or Country
Virtual Conference
Citation
ZHU, Bin; NGO, Chong-wah; and CHEN, Jingjing.
Cross-domain cross-modal food transfer. (2020). Proceedings of the 28th ACM International Conference on Multimedia, MM 2020, Seattle, October 12–16. 3762-3770.
Available at: https://ink.library.smu.edu.sg/sis_research/6497
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Included in
Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons