Research Collection School Of Computing and Information Systems

Efficient Evaluation of Continuous Text Search Queries

Publication Type

Journal Article

Version

publishedVersion

Publication Date

10-2011

Abstract

Consider a text filtering server that monitors a stream of incoming documents for a set of users, who register their interests in the form of continuous text search queries. The task of the server is to constantly maintain for each query a ranked result list, comprising the recent documents (drawn from a sliding window) with the highest similarity to the query. Such a system underlies many text monitoring applications that need to cope with heavy document traffic, such as news and email monitoring.

In this paper, we propose the first solution for processing continuous text queries efficiently. Our objective is to support a large number of user queries while sustaining high document arrival rates. Our solution indexes the streamed documents in main memory with a structure based on the principles of the inverted file, and processes document arrival and expiration events with an incremental threshold-based method. We distinguish between two versions of the monitoring algorithm, an eager and a lazy one, which differ in how aggressively they manage the thresholds on the inverted index. Using benchmark queries over a stream of real documents, we experimentally verify the efficiency of our methodology; both versions are at least an order of magnitude faster than a competitor constructed from existing techniques, with the lazy version being the best approach overall.
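The problem setting in the abstract can be illustrated with a minimal sketch. It is not the authors' eager or lazy threshold-based algorithm, which the abstract does not specify in detail; it is a naive baseline that assumes a count-based sliding window, dot-product similarity over term weights, and integer document identifiers. The class name `WindowMonitor` and all method names are hypothetical.

```python
from collections import defaultdict, deque
import heapq


class WindowMonitor:
    """Naive continuous top-k monitor (illustrative baseline, not the paper's method)."""

    def __init__(self, window_size, k):
        self.window_size = window_size   # assumed: window holds the N most recent documents
        self.k = k
        self.window = deque()            # (doc_id, term_weights), oldest first
        self.index = defaultdict(dict)   # inverted file: term -> {doc_id: weight}
        self.queries = {}                # query_id -> {term: weight}
        self.results = {}                # query_id -> [(score, doc_id), ...], best first

    def register(self, query_id, term_weights):
        self.queries[query_id] = dict(term_weights)
        self.results[query_id] = self._recompute(query_id)

    def _recompute(self, query_id):
        # Score every windowed document that shares a term with the query.
        scores = defaultdict(float)
        for term, q_weight in self.queries[query_id].items():
            for doc_id, d_weight in self.index.get(term, {}).items():
                scores[doc_id] += q_weight * d_weight
        return heapq.nlargest(self.k, ((s, d) for d, s in scores.items()))

    def arrive(self, doc_id, term_weights):
        # Arrival event: index the document, then offer it to each query's result list.
        self.window.append((doc_id, term_weights))
        for term, weight in term_weights.items():
            self.index[term][doc_id] = weight
        for query_id, query in self.queries.items():
            score = sum(w * term_weights.get(t, 0.0) for t, w in query.items())
            if score > 0:
                merged = self.results[query_id] + [(score, doc_id)]
                self.results[query_id] = heapq.nlargest(self.k, merged)
        if len(self.window) > self.window_size:
            self._expire()

    def _expire(self):
        # Expiration event: drop the oldest document from the index and
        # recompute only the result lists that contained it.
        old_id, old_terms = self.window.popleft()
        for term in old_terms:
            del self.index[term][old_id]
            if not self.index[term]:
                del self.index[term]
        for query_id, result in self.results.items():
            if any(d == old_id for _, d in result):
                self.results[query_id] = self._recompute(query_id)
```

In this sketch, every arrival touches every registered query and every relevant expiration triggers a full recomputation, which is the kind of cost an incremental threshold-based method would aim to avoid.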

Keywords

Continuous queries, document streams, text filtering

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Publication

IEEE Transactions on Knowledge and Data Engineering

Volume

23

Issue

10

First Page

1469

Last Page

1482

ISSN

1041-4347

Identifier

10.1109/TKDE.2011.125

Publisher

IEEE

Citation

MOURATIDIS, Kyriakos and PANG, Hwee Hwa. Efficient Evaluation of Continuous Text Search Queries. (2011). IEEE Transactions on Knowledge and Data Engineering. 23, (10), 1469-1482.
Available at: https://ink.library.smu.edu.sg/sis_research/812

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://doi.org/10.1109/TKDE.2011.125

Included in

Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
