Research Collection School Of Computing and Information Systems

Fast Filter-and-Refine Algorithms for Subsequence Selection

Beng-Chin OOI, National University of Singapore
Hwee Hwa PANG, Singapore Management University
Hao WANG, National University of Singapore
Limsoon WONG, Lab for Information Technology, Singapore
Cui YU, National University of Singapore

Publication Type

Conference Proceeding Article

Version

submittedVersion

Publication Date

7-2002

Abstract

Large sequence databases, such as those holding protein, DNA and gene sequences in biology, are becoming increasingly common. An important operation on a sequence database is approximate subsequence matching, which retrieves all subsequences within some distance of a given query string. This paper proposes a filter-and-refine algorithm for efficient approximate subsequence matching in large DNA sequence databases. It employs a bitmap indexing structure to condense and encode each data sequence into a shorter index sequence. During query processing, the bitmap index filters out most of the irrelevant subsequences, and the final refinement step removes the remaining false positives. Analytical and experimental studies show that the proposed strategy can reduce response time substantially while incurring only a small space overhead.
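
The filter-and-refine idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's bitmap encoding or distance measure, so everything concrete here is an assumption made for illustration: the index stores one bitmap of q-gram occurrences per fixed-size block, the distance is Hamming distance over equal-length windows, and the names `build_index`, `search`, `q`, and `block` are hypothetical. The filter relies on the fact that each mismatch can destroy at most `q` of the query's `m - q + 1` q-grams, so it may admit false positives but never drops a true match.

```python
# Illustrative sketch only: NOT the paper's index layout or distance function.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def qgram_code(s, start, q):
    """Encode the q-gram s[start:start+q] as a base-4 integer."""
    code = 0
    for ch in s[start:start + q]:
        code = code * 4 + CODE[ch]
    return code


def build_index(seq, q, block):
    """One bitmap (a Python int) per block of `block` q-gram start positions.

    Bit c of a block's bitmap is set if the q-gram with code c starts
    somewhere inside that block.
    """
    n_grams = max(len(seq) - q + 1, 0)
    bitmaps = [0] * ((n_grams + block - 1) // block)
    for i in range(n_grams):
        bitmaps[i // block] |= 1 << qgram_code(seq, i, q)
    return bitmaps


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def search(seq, query, k, q=3, block=8):
    """Return (hits, candidates): windows within Hamming distance k of the
    query, and how many windows survived the bitmap filter."""
    m = len(query)
    assert m >= q, "query must be at least q characters long"
    index = build_index(seq, q, block)
    query_bits = [1 << qgram_code(query, j, q) for j in range(m - q + 1)]
    # A true match must keep at least this many of the query's q-grams.
    threshold = (m - q + 1) - k * q
    hits, candidates = [], 0
    for s in range(len(seq) - m + 1):
        # Filter: OR together the bitmaps of all blocks the window touches.
        union = 0
        for b in range(s // block, (s + m - q) // block + 1):
            union |= index[b]
        shared = sum(1 for bit in query_bits if union & bit)
        if shared < threshold:
            continue  # cannot be within distance k, so skip it safely
        candidates += 1
        # Refine: verify the candidate exactly to remove false positives.
        d = hamming(seq[s:s + m], query)
        if d <= k:
            hits.append((s, d))
    return hits, candidates
```

Calling `search(seq, query, k=1)` on a DNA string returns the matching window offsets with their distances, together with the number of windows that needed exact verification. When `k * q` approaches the query length, the threshold falls to zero or below and the filter stops pruning, which is the usual trade-off for q-gram filters of this kind.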

Keywords

DNA sequences, approximate subsequence matching, biology, bitmap indexing structure, data sequence condensing, data sequence encoding, false positive removal, fast filter-and-refine algorithms, gene sequences, index sequence, large DNA sequence databases, large sequence databases, protein sequences, query processing, query string, response time, small space overhead, subsequence selection

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Publication

IDEAS 2002: International Database Engineering and Applications Symposium 2002, July 17-19, Edmonton, Canada

First Page

243

Last Page

254

ISBN

9780769516387

Identifier

10.1109/IDEAS.2002.1029677

Publisher

IEEE Computer Society

City or Country

Los Alamitos, CA

Citation

OOI, Beng-Chin; PANG, Hwee Hwa; WANG, Hao; WONG, Limsoon; and YU, Cui. Fast Filter-and-Refine Algorithms for Subsequence Selection. (2002). IDEAS 2002: International Database Engineering and Applications Symposium 2002, July 17-19, Edmonton, Canada. 243-254.
Available at: https://ink.library.smu.edu.sg/sis_research/1144

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

http://doi.ieeecomputersociety.org/10.1109/IDEAS.2002.1029677

Included in

Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
