Publication Type
Journal Article
Version
publishedVersion
Publication Date
9-2019
Abstract
Co-clustering addresses the problem of simultaneously clustering both dimensions of a data matrix. When dealing with high-dimensional sparse data, co-clustering turns out to be more beneficial than one-sided clustering, even if one is interested in clustering along one dimension only. Aside from being high dimensional and sparse, some datasets, such as document-term matrices, exhibit directional characteristics, and it is useful to L2-normalize such data so that it lies on the surface of a unit hypersphere. Popular co-clustering assumptions such as Gaussian or Multinomial are inadequate for this type of data. In this paper, we extend the scope of co-clustering to directional data. We present the Diagonal Block Mixture of von Mises-Fisher distributions (dbmovMFs), a co-clustering model that is well suited to directional data lying on a unit hypersphere. By formulating parameter estimation under the maximum likelihood (ML) and classification ML approaches, we develop a class of EM algorithms for estimating dbmovMFs from data. Extensive experiments on several real-world datasets confirm the advantage of our approach and demonstrate the effectiveness of our algorithms.
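As a rough illustration of the L2 normalization mentioned in the abstract, the following minimal sketch scales each row of a small document-term matrix to unit Euclidean length, so that every document lies on the unit hypersphere and the dot product of two rows equals their cosine similarity. The function names and the toy counts are hypothetical and are not taken from the paper; this is not the dbmovMFs algorithm itself, only the preprocessing step it assumes.

```python
import math


def l2_normalize_rows(matrix):
    """Scale each row to unit Euclidean length; all-zero rows are left unchanged."""
    normalized = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        normalized.append([x / norm for x in row] if norm > 0 else list(row))
    return normalized


def cosine_similarity(u, v):
    """For unit-length vectors, the dot product is the cosine of the angle between them."""
    return sum(a * b for a, b in zip(u, v))


# Toy document-term counts (hypothetical data): 3 documents, 4 terms.
counts = [
    [3, 0, 4, 0],   # document 0
    [6, 0, 8, 0],   # document 1: same direction as document 0, twice the length
    [0, 5, 0, 12],  # document 2: shares no terms with documents 0 and 1
]

unit_rows = l2_normalize_rows(counts)

# Documents 0 and 1 differ only in length, so they coincide after normalization.
same_direction = cosine_similarity(unit_rows[0], unit_rows[1])
# Documents 0 and 2 have disjoint terms, so they are orthogonal.
disjoint_terms = cosine_similarity(unit_rows[0], unit_rows[2])
```

After normalization only the direction of each document vector remains, which is the sense in which such data is called directional: document length no longer influences similarity.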
Keywords
Co-clustering, directional data, document clustering, EM algorithm, von Mises-Fisher distribution
Discipline
Databases and Information Systems
Publication
Advances in Data Analysis and Classification
Volume
13
Issue
3
First Page
591
Last Page
620
ISSN
1862-5347
Identifier
10.1007/s11634-018-0323-4
Publisher
Springer
Citation
SALAH, Aghiles and NADIF, Mohamed. Directional co-clustering. (2019). Advances in Data Analysis and Classification. 13, (3), 591-620.
Available at: https://ink.library.smu.edu.sg/sis_research/10194
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/s11634-018-0323-4