Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

1-2023

Abstract

As video collections grow rapidly, search performance can degrade significantly. In an interactive retrieval system, real-time response is what allows a user to browse, search, and refine a query. Without fast response, the main ingredient for keeping a user engaged and focused, an interactive system becomes less effective even when backed by a sophisticated deep learning model. This paper addresses this challenge by leveraging approximate search, Bayesian inference, and reinforcement learning. For approximate search, we apply Hierarchical Navigable Small World (HNSW), an efficient approximate nearest neighbor search algorithm. To quickly prune the search scope, we integrate PicHunter, one of the most popular engines in the Video Browser Showdown, with reinforcement learning. The integration gives PicHunter the ability to plan systematically. Specifically, PicHunter performs a Bayesian update with a greedy strategy to select a small number of candidates for display. With reinforcement learning, the greedy strategy is replaced with a policy network that learns to select candidates that will result in the minimum number of user iterations, which is analytically defined by a reward function. With these improvements, the interactive system searches only the subset of the video dataset relevant to a query, while quickly performing Bayesian updates with systematic planning to recommend the most probable candidates, which can potentially lead to the minimum number of iteration rounds.
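The Bayesian update and greedy display step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the softmax user model, the `sigma` parameter, the toy feature vectors, and the function names `bayesian_update` and `greedy_display` are all assumptions made for illustration. The policy network that the paper substitutes for the greedy step is not shown.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bayesian_update(prior, features, displayed, picked, sigma=1.0):
    """Return the posterior over candidate targets after one round of feedback.

    Assumed user model (hypothetical): given a true target t, the user picks
    displayed item d with probability proportional to exp(-dist(d, t) / sigma).
    """
    posterior = []
    for t, p in enumerate(prior):
        weights = {
            d: math.exp(-sq_dist(features[d], features[t]) / sigma)
            for d in displayed
        }
        likelihood = weights[picked] / sum(weights.values())
        posterior.append(p * likelihood)
    total = sum(posterior)
    return [p / total for p in posterior]


def greedy_display(posterior, k):
    """Greedy strategy: show the k items with the highest posterior."""
    order = sorted(range(len(posterior)), key=lambda i: posterior[i], reverse=True)
    return order[:k]


# Toy example: four items forming two clusters in a 2-D feature space.
features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
prior = [0.25, 0.25, 0.25, 0.25]

# The user is shown items 0 and 2 and picks item 2.
posterior = bayesian_update(prior, features, displayed=[0, 2], picked=2)
next_display = greedy_display(posterior, k=2)
```

In this toy setting, picking item 2 shifts the probability mass toward the cluster that contains items 2 and 3, so the greedy step proposes those two items for the next round.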

Keywords

Reinforcement learning, Bayesian method, Relevance feedback, Interactive video retrieval

Discipline

Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces

Research Areas

Intelligent Systems and Optimization

Publication

MultiMedia Modeling: 29th International Conference MMM 2023, Bergen, Norway, January 9-12: Proceedings

Volume

13833

First Page

690

Last Page

696

ISBN

9783031270765

Identifier

10.1007/978-3-031-27077-2_60

Publisher

Springer

City or Country

Cham

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1007/978-3-031-27077-2_60
