Research Collection School Of Computing and Information Systems

CONQUER: Contextual query-aware ranking for video corpus moment retrieval

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

10-2021

Abstract

This paper tackles the recently proposed Video Corpus Moment Retrieval task. This task is essential because advanced video retrieval applications should enable users to retrieve a precise moment from a large video corpus. We propose a novel CONtextual QUery-awarE Ranking (CONQUER) model for effective moment localization and ranking. CONQUER exploits query context for multi-modal fusion and representation learning in two steps. The first step derives fusion weights for the adaptive combination of multi-modal video content. The second step performs bi-directional attention to tightly couple video and query into a single joint representation for moment localization. Because query context is fully engaged in video representation learning, from feature fusion to transformation, the resulting feature is user-centered and has a greater capacity for capturing multi-modal signals specific to the query. We conduct studies on two datasets, TVR for closed-world TV episodes and DiDeMo for open-world user-generated videos, to investigate the potential advantages of fusing video and query online as a joint representation for moment retrieval.
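
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dot-product scoring used to derive the fusion weights, and the BiDAF-style form of the bi-directional attention are all assumptions made for illustration, and the toy vectors stand in for learned features.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def query_dependent_fusion(query_vec, visual_clips, subtitle_clips):
    """Step 1 (assumed form): score each modality against the query,
    turn the two scores into weights, and mix the modalities per clip."""
    fused = []
    for v, s in zip(visual_clips, subtitle_clips):
        w_v, w_s = softmax([dot(query_vec, v), dot(query_vec, s)])
        fused.append([w_v * a + w_s * b for a, b in zip(v, s)])
    return fused

def bidirectional_attention(clips, query_words):
    """Step 2 (assumed BiDAF-style form): attend from video to query and
    from query to video, then concatenate into one joint feature per clip."""
    dim = len(clips[0])
    # Similarity between every clip and every query word.
    sim = [[dot(c, w) for w in query_words] for c in clips]
    # Video-to-query: each clip gets a weighted summary of the query words.
    v2q = []
    for row in sim:
        a = softmax(row)
        v2q.append([sum(a[j] * query_words[j][d] for j in range(len(query_words)))
                    for d in range(dim)])
    # Query-to-video: one summary of the clips most similar to any query word.
    b = softmax([max(row) for row in sim])
    q2v = [sum(b[i] * clips[i][d] for i in range(len(clips))) for d in range(dim)]
    # Joint representation: [clip; v2q; clip * v2q; clip * q2v].
    return [c + u + [x * y for x, y in zip(c, u)] + [x * y for x, y in zip(c, q2v)]
            for c, u in zip(clips, v2q)]

# Toy inputs (hypothetical): two clips, two query words, 2-dimensional features.
query_vec = [1.0, 0.0]
visual = [[1.0, 0.0], [0.0, 1.0]]
subtitle = [[0.0, 1.0], [1.0, 0.0]]
query_words = [[1.0, 0.0], [0.0, 1.0]]

fused = query_dependent_fusion(query_vec, visual, subtitle)
joint = bidirectional_attention(fused, query_words)
```

In this sketch each joint feature is four times the input dimension; a moment localizer would then predict start and end positions over the sequence of joint features.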

Keywords

cross-modal retrieval, moment localization with natural language

Discipline

Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the 29th ACM International Conference on Multimedia, MM 2021, Virtual Conference, 2021 October 20-24

First Page

3900

Last Page

3908

ISBN

9781450386517

Identifier

10.1145/3474085.3475281

Publisher

Association for Computing Machinery, Inc

City or Country

Virtual Conference

Citation

HOU, Zhijian; NGO, Chong-Wah; and CHAN, W. K. CONQUER: Contextual query-aware ranking for video corpus moment retrieval. (2021). Proceedings of the 29th ACM International Conference on Multimedia, MM 2021, Virtual Conference, 2021 October 20-24. 3900-3908.
Available at: https://ink.library.smu.edu.sg/sis_research/6789

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Included in

Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons
