Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
6-2020
Abstract
This paper addresses the problem of text-to-image synthesis from a new perspective, i.e., the cause-and-effect chain in image generation. Causality is a common phenomenon in cooking: the appearance of a dish changes depending on the cooking actions and ingredients. The challenge of synthesis is that a generated image should depict the visual result of an action on an object. This paper presents a new network architecture, CookGAN, that mimics the visual effect of the causality chain, preserves fine-grained details, and progressively upsamples the image. In particular, a cooking simulator sub-network is proposed to incrementally make changes to food images based on the interaction between ingredients and cooking methods over a series of steps. Experiments on Recipe1M verify that CookGAN generates food images with a reasonably impressive inception score. Furthermore, the images are semantically interpretable and manipulable.
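The sketch below illustrates the causality-chain idea described in the abstract: an image feature is updated step by step from the interaction between ingredients and a cooking instruction, then decoded into an image. It is a minimal, hypothetical PyTorch sketch, not the authors' implementation; all module names, layer sizes, the gating form, and the toy decoder (a real progressive GAN would stack further upsampling generators and discriminators) are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class CookingSimulatorStep(nn.Module):
    """One simulated cooking step (illustrative): attend over ingredient embeddings
    conditioned on the current instruction, then gate the change into the image feature."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, img_feat, instr_feat, ingredient_feats):
        # Query the ingredients with the current instruction embedding.
        ctx, _ = self.attn(instr_feat.unsqueeze(1), ingredient_feats, ingredient_feats)
        joint = torch.cat([img_feat, ctx.squeeze(1)], dim=-1)
        g = self.gate(joint)  # how strongly this step alters the dish appearance
        return img_feat + g * torch.tanh(self.update(joint))

class CookGANSketch(nn.Module):
    """Hypothetical end-to-end sketch: run the simulator over cooking steps,
    then decode the final feature into a low-resolution image."""
    def __init__(self, dim=256, steps=3):
        super().__init__()
        self.steps = nn.ModuleList(CookingSimulatorStep(dim) for _ in range(steps))
        self.decode = nn.Sequential(
            nn.Linear(dim, 64 * 8 * 8), nn.Unflatten(1, (64, 8, 8)),
            nn.Upsample(scale_factor=2), nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, img_feat, instr_feats, ingredient_feats):
        # instr_feats: (batch, num_steps, dim); ingredient_feats: (batch, num_ingredients, dim)
        for t, step in enumerate(self.steps):
            img_feat = step(img_feat, instr_feats[:, t], ingredient_feats)
        return self.decode(img_feat)  # (batch, 3, 32, 32)

# Smoke test with random embeddings standing in for recipe/ingredient encoders.
model = CookGANSketch()
img = model(torch.randn(2, 256), torch.randn(2, 3, 256), torch.randn(2, 5, 256))
print(img.shape)  # torch.Size([2, 3, 32, 32])
```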
Discipline
Computer Sciences | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, June 13-19
First Page
5518
Last Page
5526
Identifier
10.1109/CVPR42600.2020.00556
Publisher
IEEE Computer Society
City or Country
Virtual Conference
Citation
ZHU, Bin and NGO, Chong-wah.
CookGAN: Causality based text-to-image synthesis. (2020). Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, June 13-19. 5518-5526.
Available at: https://ink.library.smu.edu.sg/sis_research/6484
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.