Publication Type
Journal Article
Version
acceptedVersion
Publication Date
6-2022
Abstract
Recipe generation from food images and ingredients is a challenging task that requires interpreting information from another modality. Unlike image captioning, where captions usually consist of a single sentence, cooking instructions contain multiple sentences and have an evident structure. To help the model capture the recipe structure and avoid missing cooking details, we propose a novel framework, Decomposing Generation Networks (DGN) with structure prediction, that produces more structured and complete recipes. Specifically, we split each cooking instruction into several phases and assign different sub-generators to each phase. Our approach includes two novel ideas: (i) learning the recipe structure with a global structure prediction component and (ii) producing recipe phases in the sub-generator output component based on the predicted structure. Extensive experiments on the challenging large-scale Recipe1M dataset validate the effectiveness of the proposed model, which improves on state-of-the-art results.
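The decomposition described in the abstract (choose a structure, then hand each phase to its own sub-generator) can be illustrated with a toy sketch. This is not the authors' DGN implementation: the abstract does not specify the networks, so the function names `split_into_phases`, `make_stub_generator`, and `generate_recipe`, the equal-size splitting rule, and the string-formatting "generators" are all hypothetical stand-ins, and the number of phases is passed in by hand rather than predicted from an image.

```python
from typing import Callable, List


def split_into_phases(steps: List[str], num_phases: int) -> List[List[str]]:
    """Partition steps into contiguous, near-equal phases.

    Assumption: equal-size splitting is an illustrative rule only; the
    abstract does not say how phases are delimited.
    """
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")
    base, extra = divmod(len(steps), num_phases)
    phases: List[List[str]] = []
    start = 0
    for i in range(num_phases):
        # The first `extra` phases take one additional step each.
        size = base + (1 if i < extra else 0)
        phases.append(steps[start:start + size])
        start += size
    return phases


def make_stub_generator(phase_index: int) -> Callable[[List[str]], str]:
    """Return a placeholder sub-generator (hypothetical) for one phase."""
    def generate(phase_steps: List[str]) -> str:
        # A real sub-generator would decode text from image and
        # ingredient features; this stub only tags and joins its input.
        return f"[phase {phase_index}] " + " ".join(phase_steps)
    return generate


def generate_recipe(steps: List[str], num_phases: int) -> List[str]:
    """Route each phase to its own sub-generator and collect the outputs."""
    phases = split_into_phases(steps, num_phases)
    sub_generators = [make_stub_generator(i) for i in range(num_phases)]
    return [gen(phase) for gen, phase in zip(sub_generators, phases)]


steps = [
    "Chop the onions.",
    "Heat the oil.",
    "Fry the onions.",
    "Add the rice.",
    "Simmer until done.",
]
outputs = generate_recipe(steps, num_phases=3)
for line in outputs:
    print(line)
```

In this sketch `num_phases` plays the role of the predicted structure, and each entry of `sub_generators` stands in for one phase-specific generator; in a learned system both would be model components rather than fixed rules.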
Keywords
Text generation, Vision-and-language
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Data Science and Engineering
Publication
Pattern Recognition
Volume
126
First Page
1
Last Page
9
ISSN
0031-3203
Identifier
10.1016/j.patcog.2022.108578
Publisher
Elsevier
Citation
WANG, Hao; LIN, Guosheng; HOI, Steven C. H.; and MIAO, Chunyan.
Decomposing generation networks with structure prediction for recipe generation. (2022). Pattern Recognition. 126, 1-9.
Available at: https://ink.library.smu.edu.sg/sis_research/6962
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1016/j.patcog.2022.108578
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons