Research Collection School Of Computing and Information Systems

Exploring duality in visual question-driven top-down saliency

Shengfeng HE, Singapore Management University
Chu HAN
Guoqiang HAN
Jing QIN

Publication Type

Journal Article

Publication Date

7-2020

Abstract

Top-down, goal-driven visual saliency strongly influences how the human visual system performs visual tasks. Text generation tasks, such as visual question answering (VQA) and visual question generation (VQG), are intrinsically connected with top-down saliency, which is usually involved in both VQA and VQG processes in an unsupervised manner. However, it has been shown that the regions humans choose to look at when answering questions differ substantially from those highlighted by unsupervised attention models. In this brief, we explore the intrinsic relationship between top-down saliency and text generation, and investigate whether an accurate saliency response benefits text generation. To this end, we propose a dual supervised network with dynamic parameter prediction. Dual supervision explicitly exploits the probabilistic correlation between the primal task (top-down saliency detection) and the dual task (text generation), while dynamic parameter prediction encodes the given text (i.e., question or answer) into the fully convolutional network. Extensive experiments show that the proposed top-down saliency method achieves the best correlation with human attention among various baselines. In addition, the proposed model can be guided by either questions or answers and output the counterpart. Furthermore, we show that incorporating human-like visual question-saliency improves the performance of both answer and question generation.

Keywords

Task analysis, Visualization, Feature extraction, Training, Pipelines, Learning systems, Knowledge discovery, Dual learning, saliency, visual question answering (VQA), visual question generation (VQG)

Discipline

Information Security

Research Areas

Intelligent Systems and Optimization

Publication

IEEE Transactions on Neural Networks and Learning Systems

Volume

31

Issue

7

First Page

2672

Last Page

2679

ISSN

2162-237X

Identifier

10.1109/TNNLS.2019.2933439

Publisher

Institute of Electrical and Electronics Engineers

Citation

HE, Shengfeng; HAN, Chu; HAN, Guoqiang; and QIN, Jing. Exploring duality in visual question-driven top-down saliency. (2020). IEEE Transactions on Neural Networks and Learning Systems. 31, (7), 2672-2679.
Available at: https://ink.library.smu.edu.sg/sis_research/7857

Additional URL

https://doi.org/10.1109/TNNLS.2019.2933439

This document is currently not available here.
