Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2010
Abstract
Because of the inherent ambiguity in user queries, an important task of modern retrieval systems is faceted topic retrieval (FTR): returning diverse or novel information that covers the wide range of topics, or facets, underlying the query need. We introduce a generative model for hypothesizing facets in the (news) video domain by combining the complementary information in the visual keyframes and the speech transcripts. We evaluate the efficacy of our multimodal model on the standard TRECVID-2005 video corpus annotated with facets. We find that: (1) joint modeling of the visual and text (speech transcript) information achieves a significant F-score improvement over a text-only system; (2) our model compares favorably with standard diverse ranking algorithms such as MMR [1]. Our FTR model has been implemented in a news search prototype that is undergoing commercial trial.
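For context, the diverse-ranking baseline named in the abstract, Maximal Marginal Relevance (MMR) [1], re-ranks results by trading off query relevance against redundancy with already-selected items. The sketch below is an illustrative implementation of that generic baseline, not the paper's system; the similarity functions and the lambda value are assumptions chosen for clarity.

```python
from typing import Callable, List, Sequence


def mmr_rerank(
    query_sim: Callable[[int], float],      # relevance of candidate i to the query (assumed given)
    doc_sim: Callable[[int, int], float],   # pairwise similarity between candidates (assumed given)
    candidates: Sequence[int],              # candidate document/story indices
    k: int,                                 # number of results to return
    lam: float = 0.7,                       # illustrative relevance/diversity trade-off
) -> List[int]:
    """Greedy MMR: repeatedly pick the candidate with the best balance of
    relevance to the query and dissimilarity to what is already selected."""
    selected: List[int] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # Redundancy is the max similarity to any already-selected item.
            redundancy = max((doc_sim(i, j) for j in selected), default=0.0)
            return lam * query_sim(i) - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```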
Keywords
faceted topic retrieval, multimedia topic modeling, latent Dirichlet allocation
Discipline
Artificial Intelligence and Robotics | Databases and Information Systems
Research Areas
Information Systems and Management; Intelligent Systems and Optimization
Publication
Proceedings of the IEEE International Conference on Multimedia & Expo (ICME 2010)
First Page
843
Last Page
848
ISBN
9781424474912
Identifier
10.1109/ICME.2010.5583061
City or Country
Singapore
Citation
WAN, Kong-Wah; TAN, Ah-hwee; LIM, Joo-Hwee; and CHIA, Liang-Tien.
Faceted topic retrieval of news video using joint topic modeling of visual features and speech transcripts. (2010). Proceedings of the IEEE International Conference on Multimedia & Expo (ICME 2010). 843-848.
Available at: https://ink.library.smu.edu.sg/sis_research/6875
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.