Research Collection School Of Computing and Information Systems

Cross-modal recipe retrieval: How to cook this dish?

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

1-2017

Abstract

On social media, users like to share food pictures. One intelligent feature, potentially attractive to amateur chefs, is the recommendation of a recipe along with the food. Providing this feature, unfortunately, is still technically challenging. First, current food recognition technology can only scale up to a few hundred categories, which is not yet practical for recognizing tens of thousands of food categories. Second, even one food category can have recipe variants that differ in ingredient composition. Finding the best-match recipe requires knowledge of ingredients, which is a fine-grained recognition problem. In this paper, we consider the problem from the viewpoint of cross-modality analysis. Given a large number of image and recipe pairs acquired from the Internet, a joint space is learnt to locally capture the ingredient correspondence between images and recipes. As learning happens at the region level for images and at the ingredient level for recipes, the model has the ability to generalize recognition to unseen food categories. Furthermore, the embedded multi-modal ingredient feature sheds light on the retrieval of best-match recipes. On an in-house dataset, our model can double the retrieval performance of DeViSE, a popular cross-modality model that does not consider region information during learning.
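
As an illustration only, retrieval in a joint space can be sketched as below. This is not the model from the paper: it assumes image regions and recipe ingredients have already been projected into one shared vector space, and it uses a simple hypothetical scoring rule (for each ingredient, take the best cosine similarity over all regions, then average). The names `cosine`, `match_score` and `rank_recipes`, and the toy vectors, are assumptions made for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_score(regions, ingredients):
    """Hypothetical image-recipe score: each ingredient embedding is matched
    to its most similar region embedding, and the best matches are averaged."""
    if not regions or not ingredients:
        return 0.0
    best_per_ingredient = [
        max(cosine(region, ingredient) for region in regions)
        for ingredient in ingredients
    ]
    return sum(best_per_ingredient) / len(best_per_ingredient)


def rank_recipes(regions, recipes):
    """Return (recipe name, score) pairs sorted from best to worst match."""
    scored = [(name, match_score(regions, ings)) for name, ings in recipes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-d embeddings, assumed to live in the same joint space.
    image_regions = [[1.0, 0.0], [0.0, 1.0]]
    candidate_recipes = {
        "recipe_a": [[1.0, 0.0], [0.0, 1.0]],
        "recipe_b": [[1.0, 0.0], [-1.0, 0.0]],
    }
    for name, score in rank_recipes(image_regions, candidate_recipes):
        print(name, round(score, 3))
```

Because the score is computed per ingredient rather than per dish label, a sketch of this kind can rank recipes for a dish category that never appeared during training, which is the generalization property the abstract describes.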

Keywords

Cross-modal retrieval, Multi-modality embedding, Recipe retrieval

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

MultiMedia Modeling: 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6: Proceedings

Volume

10132

First Page

588

Last Page

600

ISBN

9783319518107

Identifier

10.1007/978-3-319-51811-4_48

Publisher

Springer

City or Country

Cham

Citation

CHEN, Jingjing; PANG, Lei; and NGO, Chong-wah. Cross-modal recipe retrieval: How to cook this dish?. (2017). MultiMedia Modeling: 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6: Proceedings. 10132, 588-600.
Available at: https://ink.library.smu.edu.sg/sis_research/6674

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1007/978-3-319-51811-4_48

Included in

Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons
