Publication Type
Journal Article
Version
acceptedVersion
Publication Date
3-2013
Abstract
Co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of short and noisy text, and their performance is limited by the empirical weighting of the multi-modal features. In this paper, we propose a generalized form of Heterogeneous Fusion Adaptive Resonance Theory, called GHF-ART, for co-clustering of large-scale web multimedia documents. By extending the two-channel Heterogeneous Fusion ART (HF-ART) to multiple channels, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. For handling short and noisy text, GHF-ART does not learn directly from the textual features. Instead, it identifies key tags by learning the probabilistic distribution of tag occurrences. More importantly, GHF-ART incorporates an adaptive method for effective fusion of multi-modal features, which weights the features of multiple data sources by incrementally measuring the importance of feature modalities through the intra-cluster scatters. Extensive experiments on two web image data sets and one text document set have shown that GHF-ART achieves significantly better clustering performance and is much faster than many existing state-of-the-art algorithms.
Keywords
Semi-supervised learning, heterogeneous data co-clustering, multimedia data mining
Discipline
Databases and Information Systems | Data Storage Systems
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Knowledge and Data Engineering
Volume
26
Issue
9
First Page
2293
Last Page
2306
ISSN
1041-4347
Identifier
10.1109/TKDE.2013.47
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
MENG, Lei; TAN, Ah-hwee; and XU, Dong.
Semi-supervised heterogeneous fusion for multimedia data co-clustering. (2013). IEEE Transactions on Knowledge and Data Engineering. 26, (9), 2293-2306.
Available at: https://ink.library.smu.edu.sg/sis_research/5231
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TKDE.2013.47