Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
12-2006
Abstract
Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering results tend to be unnatural and stray from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) to document clustering. In this method, highly cohesive maximal document cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM can outperform some typical algorithms on benchmark data sets, and shed light on its advantages for natural document clustering.
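The clique-merging idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes documents are graph nodes and that an edge joins two documents whose similarity exceeds some threshold (the abstract does not specify how the graph is built). The function names, the parameter `k`, and the hand-made edge list are all hypothetical. Maximal cliques of at least `k` nodes are enumerated, and two cliques are merged when they share at least `k - 1` nodes, which is the usual adjacency rule for clique percolation.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_percolation(edges, k=3):
    """Return overlapping clusters as sorted lists of node labels."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Keep only maximal cliques large enough to contain a k-clique.
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]

    # Union-find over cliques: merge those sharing at least k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    # Each cluster is the union of the cliques in one merged group.
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical similarity graph over six documents (assumed, not from the paper).
example_edges = [
    ("d1", "d2"), ("d1", "d3"), ("d2", "d3"),
    ("d2", "d4"), ("d3", "d4"),
    ("d4", "d5"), ("d4", "d6"), ("d5", "d6"),
]
clusters = clique_percolation(example_edges, k=3)
print(clusters)
# [['d1', 'd2', 'd3', 'd4'], ['d4', 'd5', 'd6']]
```

In this toy graph, document `d4` belongs to both clusters, which shows how the approach yields overlapping groups without fixing the number of clusters in advance.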
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 21st International Conference on Computer Processing of Oriental Languages (ICCPOL 2006)
First Page
97
Last Page
108
Identifier
10.1007/11940098_10
Publisher
LNAI, Springer
City or Country
Singapore
Citation
GAO, Wei; WONG, Kam-Fai; XIA, Yunqing; and XU, Ruifeng.
Clique percolation for finding naturally cohesive and overlapping document clusters. (2006). Proceedings of the 21st International Conference on Computer Processing of Oriental Languages (ICCPOL 2006). 97-108.
Available at: https://ink.library.smu.edu.sg/sis_research/4602
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/11940098_10