Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
1-2023
Abstract
With the tremendous increase in video data size, search performance can degrade significantly. In an interactive system, real-time response is what allows a user to browse, search, and refine a query. Without speed, the main ingredient that keeps a user engaged and focused, an interactive system becomes less effective even when backed by a sophisticated deep learning system. This paper addresses this challenge by leveraging approximate search, Bayesian inference, and reinforcement learning. For approximate search, we apply hierarchical navigable small world (HNSW), an efficient approximate nearest neighbor search algorithm. To quickly prune the search scope, we integrate PicHunter, one of the most popular engines in the Video Browser Showdown, with reinforcement learning. The integration gives PicHunter the ability to plan systematically. Specifically, PicHunter performs a Bayesian update and then uses a greedy strategy to select a small number of candidates for display. With reinforcement learning, the greedy strategy is replaced by a policy network that learns to select the candidates expected to minimize the number of user iterations, an objective defined analytically by a reward function. With these improvements, the interactive system searches only a subset of video datasets relevant to a query, while quickly performing Bayesian updates with systematic planning to recommend the most probable candidates, which can potentially lead to the minimum number of iteration rounds.
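The Bayesian update and greedy display step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax user model, the squared Euclidean distance, the `sigma` temperature, and the function names `bayesian_update` and `greedy_display` are all assumptions made for illustration. The sketch covers only the greedy baseline, not the policy network that the paper substitutes for it.

```python
import math

def bayesian_update(posterior, features, display, picked, sigma=0.5):
    """One relevance-feedback round (illustrative, assumed user model).

    posterior: current probability that each item is the search target
    features:  one feature vector per item
    display:   indices of the items shown to the user
    picked:    position within `display` of the item the user selected
    """
    updated = []
    for prior, target in zip(posterior, features):
        # Assumed softmax user model: if `target` were the true target,
        # displayed items closer to it are more likely to be picked.
        logits = [-math.dist(target, features[d]) ** 2 / sigma for d in display]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(value - peak) for value in logits]
        likelihood = weights[picked] / sum(weights)
        updated.append(prior * likelihood)
    total = sum(updated)
    return [p / total for p in updated]

def greedy_display(posterior, k):
    """Greedy baseline: show the k items that are currently most probable."""
    ranked = sorted(range(len(posterior)), key=lambda i: -posterior[i])
    return ranked[:k]

# Toy run: four items on a line; the user picks item 0 over item 2.
features = [(0.0,), (1.0,), (10.0,), (11.0,)]
posterior = bayesian_update([0.25] * 4, features, display=[0, 2], picked=0)
next_display = greedy_display(posterior, k=2)
```

In this toy run the probability mass shifts toward the items near the picked one, so the next display is drawn from that neighborhood. A learned policy would replace `greedy_display` with a selection aimed at reducing the number of remaining rounds.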
Keywords
Reinforcement learning, Bayesian method, Relevance feedback, Interactive video retrieval
Discipline
Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
MultiMedia Modeling: 29th International Conference MMM 2023, Bergen, Norway, January 9-12: Proceedings
Volume
13833
First Page
690
Last Page
696
ISBN
9783031270765
Identifier
10.1007/978-3-031-27077-2_60
Publisher
Springer
City or Country
Cham
Citation
MA, Zhixin; WU, Jiaxin; LOO, Weixiong; and NGO, Chong-wah.
Reinforcement learning enhanced PicHunter for interactive search. (2023). MultiMedia Modeling: 29th International Conference MMM 2023, Bergen, Norway, January 9-12: Proceedings. 13833, 690-696.
Available at: https://ink.library.smu.edu.sg/sis_research/7819
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/978-3-031-27077-2_60
Included in
Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons