Publication Type

Journal Article

Version

acceptedVersion

Publication Date

3-2013

Abstract

Co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of short and noisy text and their performance is limited by the empirical weighting of the multi-modal features. In this paper, we propose a generalized form of Heterogeneous Fusion Adaptive Resonance Theory, called GHF-ART, for co-clustering of large-scale web multimedia documents. By extending the two-channel Heterogeneous Fusion ART (HF-ART) to multiple channels, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. For handling short and noisy text, GHF-ART does not learn directly from the textual features. Instead, it identifies key tags by learning the probabilistic distribution of tag occurrences. More importantly, GHF-ART incorporates an adaptive method for effective fusion of multi-modal features, which weights the features of multiple data sources by incrementally measuring the importance of feature modalities through the intra-cluster scatters. Extensive experiments on two web image data sets and one text document set have shown that GHF-ART achieves significantly better clustering performance and is much faster than many existing state-of-the-art algorithms.

Keywords

Semi-supervised learning, heterogeneous data co-clustering, multimedia data mining

Discipline

Databases and Information Systems | Data Storage Systems

Research Areas

Data Science and Engineering

Publication

IEEE Transactions on Knowledge and Data Engineering

Volume

26

Issue

9

First Page

2293

Last Page

2306

ISSN

1041-4347

Identifier

10.1109/TKDE.2013.47

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Citation

MENG, Lei; TAN, Ah-hwee; and XU, Dong. Semi-supervised heterogeneous fusion for multimedia data co-clustering. (2013). IEEE Transactions on Knowledge and Data Engineering. 26, (9), 2293-2306.
Available at: https://ink.library.smu.edu.sg/sis_research/5231

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/TKDE.2013.47

Included in

Databases and Information Systems Commons, Data Storage Systems Commons

Research Collection School Of Computing and Information Systems

Semi-supervised heterogeneous fusion for multimedia data co-clustering
