Research Collection School Of Computing and Information Systems

Keyword-driven image captioning via Context-dependent Bilateral LSTM

Xiaodan ZHANG
Shengfeng HE, Singapore Management University
Xinhang SONG
Pengxu WEI
Shuqiang JIANG
Qixiang YE
Jianbin JIAO
Rynson W. H. LAU

Publication Type

Conference Proceeding Article

Publication Date

7-2017

Abstract

Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information: they typically generate one sentence per image, with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method generates various descriptions according to the provided guidance. Hence, descriptions with different focuses can be generated for the same image. Our method is based on a new Context-dependent Bilateral Long Short-Term Memory (CDB-LSTM) model that predicts a keyword-driven sentence by considering word dependence. The word dependence is explored externally with a bilateral pipeline, and internally with a unified and joint training process. Experiments on the MS COCO dataset demonstrate that the proposed approach not only significantly outperforms the baseline method but also shows good adaptation and consistency with various keywords.

Keywords

Image captioning, Keyword-driven, LSTM

Discipline

Artificial Intelligence and Robotics

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of 2017 IEEE International Conference on Multimedia and Expo, Hong Kong, China, July 10-14

First Page

781

Last Page

786

ISBN

9781509060672

Identifier

10.1109/icme.2017.8019525

Publisher

IEEE Computer Society

City or Country

New York, NY, USA

Citation

ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; and LAU, Rynson W. H. Keyword-driven image captioning via Context-dependent Bilateral LSTM. (2017). Proceedings of 2017 IEEE International Conference on Multimedia and Expo, Hong Kong, China, July 10-14. 781-786.
Available at: https://ink.library.smu.edu.sg/sis_research/8497

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1109/ICME.2017.8019525

This document is currently not available here.
