Publication Type

Journal Article

Publication Date

5-2017

Abstract

The efficient processing of document streams plays an important role in many information filtering systems. Emerging applications, such as news update filtering and social network notifications, demand presenting end-users with the most relevant content to their preferences. In this work, user preferences are indicated by a set of keywords. A central server monitors the document stream and continuously reports to each user the top-k documents that are most relevant to her keywords. Our objective is to support large numbers of users and high stream rates, while refreshing the top-k results almost instantaneously. Our solution abandons the traditional frequency-ordered indexing approach. Instead, it follows an identifier-ordering paradigm that suits better the nature of the problem. When complemented with a novel, locally adaptive technique, our method offers (i) proven optimality w.r.t. the number of considered queries per stream event, and (ii) an order of magnitude shorter response time (i.e., time to refresh the query results) than the current state-of-the-art.

Keywords

Top-k query, Continuous query, Document stream

Discipline

Databases and Information Systems | Management Information Systems

Research Areas

Data Management and Analytics

Publication

IEEE Transactions on Knowledge and Data Engineering

Volume

29

Issue

5

First Page

991

Last Page

1003

ISSN

1041-4347

Identifier

10.1109/TKDE.2017.2657622

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

http://doi.org/10.1109/TKDE.2017.2657622
