Publication Type
Journal Article
Version
publishedVersion
Publication Date
10-2009
Abstract
Recent years have observed increasing efforts on graph mining and many algorithms have been developed for this purpose. However, most of the existing algorithms are designed for discovering frequent subgraphs in a set of labeled graphs only. Also, the few algorithms that find frequent subgraphs in a single labeled graph typically identify subgraphs appearing regionally in the input graph. In contrast, for real-world applications, it is commonly required that the identified frequent subgraphs in a single labeled graph should also be globally distributed. This paper thus fills this crucial void by proposing a new measure, termed G-Measure, to find globally distributed frequent subgraphs, called G-Patterns, in a single labeled graph. Specifically, we first show that the G-Patterns, selected by G-Measure, tend to be globally distributed in the input graph. Then, we present that G-Measure has the downward closure property, which guarantees the G-Measure value of a G-Pattern is not less than those of its supersets. Consequently, a G-Miner algorithm is developed for finding G-Patterns. Experimental results on four synthetic and seven real-world data sets and comparison with the existing algorithms demonstrate the efficacy of the G-Measure and the G-Miner for finding G-Patterns. Finally, an application of the G-Patterns is given
Keywords
Frequent subgraph mining, G-Measure, G-Pattern
Discipline
Computer Engineering | Databases and Information Systems | Data Storage Systems
Research Areas
Data Science and Engineering
Publication
Data and Knowledge Engineering
Volume
68
Issue
10
First Page
1034
Last Page
1058
ISSN
0169-023X
Identifier
10.1016/j.datak.2009.04.008
Publisher
Elsevier: 24 months
Citation
JIANG, Xing; XIONG, Hui; WANG, Chen; and TAN, Ah-hwee.
Mining globally distributed frequent subgraphs in a single labeled graph. (2009). Data and Knowledge Engineering. 68, (10), 1034-1058.
Available at: https://ink.library.smu.edu.sg/sis_research/5196
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1016/j.datak.2009.04.008