Publication Type
Journal Article
Version
acceptedVersion
Publication Date
5-2017
Abstract
The efficient processing of document streams plays an important role in many information filtering systems. Emerging applications, such as news update filtering and social network notifications, demand presenting end-users with the content most relevant to their preferences. In this work, user preferences are indicated by a set of keywords. A central server monitors the document stream and continuously reports to each user the top-k documents that are most relevant to her keywords. Our objective is to support large numbers of users and high stream rates, while refreshing the top-k results almost instantaneously. Our solution abandons the traditional frequency-ordered indexing approach. Instead, it follows an identifier-ordering paradigm that better suits the nature of the problem. When complemented with a novel, locally adaptive technique, our method offers (i) proven optimality with respect to the number of considered queries per stream event, and (ii) an order of magnitude shorter response time (i.e., time to refresh the query results) than the current state of the art.
Keywords
Top-k query, Continuous query, Document stream
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Intelligent Systems and Optimization
Publication
IEEE Transactions on Knowledge and Data Engineering
Volume
29
Issue
5
First Page
991
Last Page
1003
ISSN
1041-4347
Identifier
10.1109/TKDE.2017.2657622
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
U, Leong Hou; ZHANG, Junjie; MOURATIDIS, Kyriakos; and LI, Ye.
Continuous Top-k monitoring on document streams. (2017). IEEE Transactions on Knowledge and Data Engineering. 29, (5), 991-1003.
Available at: https://ink.library.smu.edu.sg/sis_research/3643
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TKDE.2017.2657622
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons