Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

10-2020

Abstract

Answering queries with semantic concepts has long been the mainstream approach for video search. Only recently has its performance been surpassed by the concept-free approach, which embeds queries and videos in a joint space. Nevertheless, the embedded features and the search results are not interpretable, which hinders subsequent steps in video browsing and query reformulation. This paper integrates feature embedding and concept interpretation into a neural network for unified dual-task learning. In this way, an embedding is associated with a list of semantic concepts that serves as an interpretation of the video content. This paper empirically demonstrates that, by using either the embedding features or the concepts, considerable search improvement is attainable on TRECVid benchmark datasets. Concepts are not only effective in pruning false-positive videos, but also highly complementary to concept-free search, leading to a large margin of improvement over state-of-the-art approaches.

Keywords

ad-hoc video search, concept-based search, concept-free search, interpretable video search

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the 28th ACM International Conference on Multimedia, MM 2020, Seattle, October 12–16

First Page

3357

Last Page

3366

ISBN

9781450379885

Identifier

10.1145/3394171.3413916

Publisher

Association for Computing Machinery, Inc

City or Country

Virtual Conference
