Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2025
Abstract
Nutrition estimation is an important component of promoting healthy eating and mitigating diet-related health risks. Despite advances in tasks such as food classification and ingredient recognition, progress in nutrition estimation remains limited by the lack of datasets with nutritional annotations. To address this issue, we introduce FastFood, a dataset with 84,446 images across 908 fast food categories, featuring ingredient and nutritional annotations. In addition, we propose a new model-agnostic Visual-Ingredient Feature Fusion (VIF²) method that enhances nutrition estimation by integrating visual and ingredient features. Ingredient robustness is improved through synonym replacement and resampling strategies during training. The ingredient-aware visual feature fusion module combines ingredient features with the visual representation to achieve accurate nutritional prediction. During testing, ingredient predictions are refined using large multimodal models through data augmentation and majority voting. Our experiments on both the FastFood and Nutrition5k datasets validate the effectiveness of the proposed method built on different backbones (e.g., ResNet, InceptionV3 and ViT), demonstrating the importance of ingredient information in nutrition estimation. Project page: https://huiyanqi.github.io/fastfood-nutrition-estimation/
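The test-time refinement mentioned in the abstract (majority voting over ingredient predictions from augmented inputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the `min_share` threshold, and the example ingredient sets are all hypothetical, and the sketch assumes that each augmented view of an image has already produced a set of predicted ingredient names.

```python
from collections import Counter
from typing import Iterable, List, Set


def majority_vote(predictions: Iterable[Set[str]], min_share: float = 0.5) -> List[str]:
    """Keep ingredients predicted in more than `min_share` of the views.

    `predictions` holds one set of ingredient names per augmented view.
    Both the name and the threshold are assumptions made for illustration.
    """
    views = [set(p) for p in predictions]
    if not views:
        return []
    # Count in how many views each ingredient appears (at most once per view).
    counts = Counter(ingredient for view in views for ingredient in view)
    threshold = min_share * len(views)
    return sorted(name for name, n in counts.items() if n > threshold)


# Hypothetical ingredient sets predicted for three augmented views of one image.
views = [
    {"beef patty", "bun", "cheese"},
    {"beef patty", "bun", "lettuce"},
    {"beef patty", "cheese", "bun"},
]
print(majority_vote(views))  # ['beef patty', 'bun', 'cheese']
```

In this sketch an ingredient survives only if it appears in a strict majority of views, so a one-off prediction such as "lettuce" is dropped while consistently predicted ingredients are kept.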
Keywords
Nutrition estimation, ingredient recognition, dataset
Discipline
Data Storage Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the 2025 International Conference on Multimedia Retrieval, Chicago, IL, USA, June 30 - July 3
First Page
1091
Last Page
1099
Identifier
10.1145/3731715.3733269
Publisher
ACM
City or Country
New York
Citation
QI, Huiyan; ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; and LIM, Ee-peng.
Advancing food nutrition estimation via visual-ingredient feature fusion. (2025). Proceedings of the 2025 International Conference on Multimedia Retrieval, Chicago, IL, USA, June 30 - July 3. 1091-1099.
Available at: https://ink.library.smu.edu.sg/sis_research/10384
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3731715.3733269