Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
4-2011
Abstract
Multimedia documents on popular image and video sharing websites such as Flickr and YouTube are heterogeneous documents with diverse representations and rich user-supplied information. In this paper, we investigate how the agreement among heterogeneous modalities can be exploited to guide data fusion. The problem of fusion is cast as the simultaneous mining of agreement from different modalities and adaptation of fusion weights to construct a fused graph from these modalities. An iterative framework based on agreement-fusion optimization is thus proposed. We plug two well-known algorithms, random walk and semi-supervised learning, into this framework to illustrate how agreement (conflict) is incorporated (compromised) in the case of uniform and adaptive fusion. Experimental results on web video and image re-ranking demonstrate that, with a proper fusion strategy rather than simple linear fusion, search performance can generally be expected to improve.
Keywords
graph fusion, heterogeneous modality fusion, modality agreement, re-ranking
Discipline
Data Storage Systems | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 1st ACM International Conference on Multimedia Retrieval: ICMR '11, Trento, Italy, April 17-20
First Page
1
Last Page
8
ISBN
9781450303361
Identifier
10.1145/1991996.1992011
Publisher
ACM
City or Country
Trento, Italy
Citation
TAN, Hung-Khoon and NGO, Chong-wah. Fusing heterogeneous modalities for video and image re-ranking. (2011). Proceedings of the 1st ACM International Conference on Multimedia Retrieval: ICMR '11, Trento, Italy, April 17-20. 1-8.
Available at: https://ink.library.smu.edu.sg/sis_research/6518
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.