Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
4-2007
Abstract
Extensive research on frequent-pattern mining in the past decade has brought forth a number of pattern mining algorithms that are both effective and efficient. However, the existing frequent-pattern mining algorithms encounter challenges in mining rather large patterns, called colossal frequent patterns, in the presence of an explosive number of frequent patterns. Colossal patterns are critical to many applications, especially in domains like bioinformatics. In this study, we investigate a novel mining approach called Pattern-Fusion to efficiently find a good approximation to the colossal patterns. With Pattern-Fusion, a colossal pattern is discovered by fusing its small core patterns in one step, whereas the incremental pattern-growth mining strategies, such as those adopted in Apriori and FP-growth, have to examine a large number of mid-sized ones. This property distinguishes Pattern-Fusion from all the existing frequent-pattern mining approaches and introduces a new mining methodology. Our empirical studies show that, in cases where current mining algorithms cannot proceed, Pattern-Fusion is able to mine a result set that is a close enough approximation to the complete set of the colossal patterns, under a quality evaluation model proposed in this paper.
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Publication
IEEE 23rd International Conference on Data Engineering 2007: 15 - 20 April, Istanbul, Turkey: Proceedings
First Page
706
Last Page
715
ISBN
9781424408023
Identifier
10.1109/ICDE.2007.367916
Publisher
IEEE Computer Society
City or Country
Los Alamitos, CA
Citation
ZHU, Feida; YAN, Xifeng; HAN, Jiawei; YU, Philip S.; and CHENG, Hong.
Mining Colossal Frequent Patterns by Core Pattern Fusion. (2007). IEEE 23rd International Conference on Data Engineering 2007: 15 - 20 April, Istanbul, Turkey: Proceedings. 706-715.
Available at: https://ink.library.smu.edu.sg/sis_research/1007
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.ieeecomputersociety.org/10.1109/ICDE.2007.367916
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons