Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2007
Abstract
Near-duplicate keyframe retrieval is a critical task for video similarity measurement, video threading, and tracking. In this paper, instead of using expensive point-to-point matching of keypoints, we investigate visual language models built on visual keywords to speed up near-duplicate keyframe retrieval. The main idea is to estimate a visual language model on visual keywords for each keyframe and to compare keyframes by the likelihood of their visual language models. Experiments on a subset of the TRECVID-2004 video corpus show that visual language models built on visual keywords deliver promising performance for near-duplicate keyframe retrieval, greatly speeding up retrieval while sacrificing only a little accuracy compared to expensive point-to-point matching.
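The retrieval idea the abstract outlines can be sketched in a few lines: quantize each keyframe's keypoints into visual keywords, estimate a smoothed unigram language model per keyframe, and rank candidates by the likelihood of the query's visual words under each candidate's model. This is a minimal sketch under assumed choices, not the paper's implementation: the Laplace smoothing, the toy visual-word ids, and all function names here are hypothetical.

```python
from collections import Counter
import math

def estimate_model(visual_words, vocab_size, alpha=1.0):
    """Estimate a smoothed unigram language model over visual keywords.

    visual_words: visual-word ids quantized from a keyframe's keypoints.
    alpha: Laplace smoothing constant (an assumed choice, not from the paper).
    """
    counts = Counter(visual_words)
    total = len(visual_words) + alpha * vocab_size
    return {w: (counts[w] + alpha) / total for w in range(vocab_size)}

def log_likelihood(visual_words, model):
    """Log-likelihood of one keyframe's visual words under another's model."""
    return sum(math.log(model[w]) for w in visual_words)

# Usage: rank candidate keyframes by the likelihood of the query's
# visual words under each candidate's model (toy vocabulary of 10 words).
query = [3, 7, 7, 1]
candidate = [3, 7, 2, 7, 1]
model = estimate_model(candidate, vocab_size=10)
score = log_likelihood(query, model)
```

Scoring a candidate this way needs only one table lookup per visual word, which is the source of the speed-up over point-to-point keypoint matching that the abstract reports.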
Discipline
Databases and Information Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 2007 International Conference on Multimedia & Expo (ICME), Beijing, July 2-5
First Page
500
Last Page
503
ISBN
9781424410170
Identifier
10.1109/icme.2007.4284696
Publisher
IEEE Computer Society
City or Country
Beijing, China
Citation
WU, Xiao; ZHAO, Wan-Lei; and NGO, Chong-wah.
Efficient near-duplicate keyframe retrieval with visual language models. (2007). Proceedings of the 2007 International Conference on Multimedia & Expo (ICME), Beijing, July 2-5. 500-503.
Available at: https://ink.library.smu.edu.sg/sis_research/6603
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Included in
Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons