Publication Type
Journal Article
Version
submittedVersion
Publication Date
7-2020
Abstract
Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. Previous research mainly clusters same-type nodes independently by exploiting structural similarity search, ignoring the correlations between different-type nodes. In this paper, we focus on the problem of co-clustering heterogeneous nodes, where the goal is to mine the latent relevance of heterogeneous nodes and simultaneously partition them into the corresponding type-aware clusters. This problem is challenging in two respects. First, the similarity or relevance of nodes is associated not only with multiple meta-path-based structures but also with numerical and categorical attributes. Second, clustering and similarity/relevance search usually promote each other. To address this problem, we first design a learnable overall relevance measure that integrates structural and attributed relevance by employing meta-paths and attribute projection. We then propose a novel approach, called SCCAIN, to co-cluster heterogeneous nodes based on constrained orthogonal non-negative matrix tri-factorization. Furthermore, an end-to-end framework is developed to jointly optimize the relevance measures and co-clustering. Extensive experiments on real-world datasets not only demonstrate that SCCAIN consistently outperforms state-of-the-art methods but also validate the effectiveness of integrating attributed and structural information for co-clustering.
Keywords
co-clustering, heterogeneous information network, meta-paths, matrix tri-factorization, semi-supervised learning
Discipline
Databases and Information Systems | OS and Networks
Research Areas
Data Science and Engineering
Publication
Information Processing and Management
Volume
57
Issue
6
First Page
1
Last Page
12
ISSN
0306-4573
Identifier
10.1016/j.ipm.2020.102338
Publisher
Elsevier
Citation
1
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1016/j.ipm.2020.102338