Keyword-driven image captioning via Context-dependent Bilateral LSTM
Publication Type
Conference Proceeding Article
Publication Date
7-2017
Abstract
Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information: they typically generate one sentence per image with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method generates various descriptions according to the provided guidance. Hence, descriptions with different focuses can be generated for the same image. Our method is based on a new Context-dependent Bilateral Long Short-Term Memory (CDB-LSTM) model that predicts a keyword-driven sentence by considering the word dependence. The word dependence is explored externally with a bilateral pipeline, and internally with a unified and joint training process. Experiments on the MS COCO dataset demonstrate that the proposed approach not only significantly outperforms the baseline method but also shows good adaptation and consistency with various keywords.
Keywords
Image captioning, Keyword-driven, LSTM
Discipline
Artificial Intelligence and Robotics
Research Areas
Software and Cyber-Physical Systems
Publication
Proceedings of 2017 IEEE International Conference on Multimedia and Expo, Hong Kong, China, July 10-14
First Page
781
Last Page
786
ISBN
9781509060672
Identifier
10.1109/icme.2017.8019525
Publisher
IEEE Computer Society
City or Country
New York, NY, USA
Citation
ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; and LAU, Rynson W. H.
Keyword-driven image captioning via Context-dependent Bilateral LSTM. (2017). Proceedings of 2017 IEEE International Conference on Multimedia and Expo, Hong Kong, China, July 10-14. 781-786.
Available at: https://ink.library.smu.edu.sg/sis_research/8497
Copyright Owner and License
Authors
Additional URL
https://doi.org/10.1109/ICME.2017.8019525