Research Collection School Of Computing and Information Systems

Efficient Discovery of Frequent Approximate Sequential Patterns

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

10-2007

Abstract

We propose an efficient algorithm for mining frequent approximate sequential patterns under the Hamming distance model. Our algorithm gains its efficiency by adopting a "break-down-and-build-up" methodology. The "breakdown" is based on the observation that all occurrences of a frequent pattern can be classified into groups, which we call strands. We developed efficient algorithms to quickly mine out all strands by iterative growth. In the "build-up" stage, these strands are grouped up to form the support sets from which all approximate patterns would be identified. A salient feature of our algorithm is its ability to grow the frequent patterns by iteratively assembling building blocks of significant sizes in a local search fashion. By avoiding incremental growth and global search, we achieve greater efficiency without losing the completeness of the mining result. Our experimental studies demonstrate that our algorithm is efficient in mining globally repeating approximate sequential patterns that would have been missed by existing methods.
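
As a rough illustration of the Hamming distance model named in the abstract, the following minimal sketch counts the approximate occurrences of a pattern in a sequence by brute force. It is not the break-down-and-build-up algorithm of the paper, which the abstract does not specify in detail; the function names `hamming` and `approximate_occurrences` and the parameter `max_mismatches` are hypothetical and chosen only for this example.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def approximate_occurrences(sequence: str, pattern: str, max_mismatches: int) -> list[int]:
    """Start offsets of every window within max_mismatches of pattern.

    Brute-force scan over all windows (an assumption for illustration,
    not the strand-based method described in the abstract).
    """
    m = len(pattern)
    return [
        i
        for i in range(len(sequence) - m + 1)
        if hamming(sequence[i:i + m], pattern) <= max_mismatches
    ]


if __name__ == "__main__":
    # Offsets of windows that differ from "ACGT" in at most one position.
    print(approximate_occurrences("ACGTACCTAGGT", "ACGT", 1))
```

Under this sketch, a pattern would count as frequent when the number of returned offsets reaches some chosen support threshold; the threshold and its exact definition are left open here because the abstract does not state them.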

Keywords

Hamming distance model, approximate sequential patterns, break-down-and-build-up methodology, frequent approximate sequential patterns, global search, incremental growth

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Publication

IEEE 7th International Conference on Data Mining Seventh: ICDM 2007: October 28-31, Omaha, Nebraska: Proceedings

First Page

751

Last Page

756

ISBN

9780769530185

Identifier

10.1109/ICDM.2007.75

Publisher

IEEE Computer Society

City or Country

Los Alamitos, CA

Citation

ZHU, Feida; YAN, Xifeng; HAN, Jiawei; and YU, Philip S. Efficient Discovery of Frequent Approximate Sequential Patterns. (2007). IEEE 7th International Conference on Data Mining Seventh: ICDM 2007: October 28-31, Omaha, Nebraska: Proceedings. 751-756.
Available at: https://ink.library.smu.edu.sg/sis_research/933

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

https://doi.ieeecomputersociety.org/10.1109/ICDM.2007.75

Included in

Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
