Publication Type
Journal Article
Version
publishedVersion
Publication Date
7-2004
Abstract
The importance of data mining is apparent with the advent of powerful data collection and storage tools; raw data is so abundant that manual analysis is no longer possible. Unfortunately, data mining problems are difficult to solve, which has prompted the introduction of several novel data structures to improve mining efficiency. Here, we critically examine existing preprocessing data structures used in association rule mining to enhance performance, in an attempt to understand their strengths and weaknesses. Our analyses culminate in a practical structure called the SOTrieIT (support-ordered trie itemset) and two synergistic algorithms that accompany it for the fast discovery of frequent itemsets. Experiments involving a wide range of synthetic data sets reveal that its algorithms outperform FP-growth, a recent association rule mining algorithm with excellent performance, by up to two orders of magnitude, thus verifying its efficiency and viability.
Keywords
Association rule mining, Data mining, Data structures
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Knowledge and Data Engineering
Volume
16
Issue
7
First Page
875
Last Page
879
ISSN
1041-4347
Identifier
10.1109/TKDE.2004.1318569
Publisher
IEEE
Citation
LIM, Ee Peng; WOON, Yew-Kwong; and NG, Wee-Keong.
A support-ordered trie for fast frequent itemset discovery. (2004). IEEE Transactions on Knowledge and Data Engineering. 16, (7), 875-879.
Available at: https://ink.library.smu.edu.sg/sis_research/123
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TKDE.2004.1318569
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons