Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2007
Abstract
Near-duplicate keyframes (NDK) play a unique role in large-scale video search and in news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach that exploits both visual and textual cues, drawn from a visual vocabulary and from semantic context respectively. The vocabulary, which provides entries for visual keywords, is formed by clustering local keypoints. The semantic context is inferred from the speech transcript surrounding a keyframe. We evaluate the usefulness of visual keywords and semantic context, separately and jointly, using cosine similarity and language models. Linearly fusing the two modalities improves performance over techniques based on keypoint matching. Whereas keypoint matching is computationally expensive because it requires online nearest neighbor search, our approach is effective and efficient enough for online video search.
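The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each keyframe is already represented as a bag of visual keywords (cluster IDs of its local keypoints) and a bag of transcript words, and it uses cosine similarity for both modalities; the language-model variant is not shown. The function names and the fusion weight `alpha` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (dict-like)."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def fused_score(visual_q, visual_k, text_q, text_k, alpha=0.5):
    """Linear fusion of visual and textual similarity.

    alpha is a hypothetical weight in [0, 1]; the abstract does not
    state the value used in the paper.
    """
    visual_sim = cosine(visual_q, visual_k)
    text_sim = cosine(text_q, text_k)
    return alpha * visual_sim + (1 - alpha) * text_sim


# Toy data: visual keywords are cluster IDs, text is transcript tokens.
query_visual = Counter([3, 3, 17, 42])
cand_visual = Counter([3, 17, 17, 42])
query_text = Counter("president visits flood area".split())
cand_text = Counter("president tours flood zone".split())

score = fused_score(query_visual, cand_visual, query_text, cand_text, alpha=0.6)
```

Because each keyframe is reduced to sparse count vectors ahead of time, scoring a candidate needs only a dot product per modality and no nearest neighbor search over keypoints at query time.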
Keywords
Image retrieval, Language model, Multiple modalities, Near-duplicate keyframe, News videos, Similarity measure
Discipline
Databases and Information Systems | Theory and Algorithms
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR 2007, Amsterdam, July 9 - 11
First Page
162
Last Page
169
ISBN
9781595937339
Identifier
10.1145/1282280.1282309
Publisher
ACM
City or Country
Amsterdam
Citation
WU, Xiao; ZHAO, Wan-Lei; and NGO, Chong-wah.
Near-duplicate keyframe retrieval with visual keywords and semantic context. (2007). Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR 2007, Amsterdam, July 9 - 11. 162-169.
Available at: https://ink.library.smu.edu.sg/sis_research/6445
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.