Research Collection School Of Computing and Information Systems

Efficient Processing of Exact Top-k Queries over Disk-Resident Sorted Lists

Hwee Hwa PANG, Singapore Management University
Xuhua DING, Singapore Management University
Baihua ZHENG, Singapore Management University

Publication Type

Journal Article

Version

submittedVersion

Publication Date

6-2010

Abstract

The top-k query is employed in a wide range of applications to generate a ranked list of the data that have the highest aggregate scores over certain attributes. As the pool of attributes from which individual queries select may be large, the data are indexed with per-attribute sorted lists, and a threshold algorithm (TA) is applied to the lists involved in each query. The TA executes in two phases: it first finds a cut-off threshold for the top-k result scores, then evaluates all the records that could score above the threshold. In this paper, we focus on exact top-k queries that involve monotonic linear scoring functions over disk-resident sorted lists. We introduce a model for estimating the depth to which each sorted list needs to be processed in the two phases, so that most of the required records can be fetched efficiently through sequential or batched I/Os. We also devise a mechanism to quickly rank the data that qualify for the query answer and to eliminate those that do not, in order to reduce the computational demand on the query processor. Extensive experiments with four different datasets confirm that our schemes achieve speed-ups ranging from two times to two orders of magnitude over existing TAs, at the expense of a memory overhead of 4.8 bits per attribute value. Moreover, our schemes are robust to different data distributions and query characteristics.
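For readers unfamiliar with the threshold algorithm mentioned in the abstract, the following is a minimal sketch of the classic textbook TA, not the depth-estimation scheme proposed in the paper. It assumes in-memory lists standing in for the disk-resident ones, non-negative weights (so the linear score is monotonic), and round-robin sorted access with immediate random access. The function name `threshold_topk` and the sample data are hypothetical. Unlike the two-phase description above, this basic variant interleaves finding the threshold with evaluating candidates.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def threshold_topk(
    records: Dict[int, Sequence[float]],
    weights: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    """Return the k highest-scoring (score, record_id) pairs, best first.

    Assumption: weights are non-negative, so the weighted sum is monotonic.
    """
    m = len(weights)
    # One sorted list per attribute, ordered by descending attribute value.
    sorted_lists = [
        sorted(((vals[a], rid) for rid, vals in records.items()), reverse=True)
        for a in range(m)
    ]

    def score(rid: int) -> float:
        # "Random access": look up all attribute values of one record.
        return sum(w * v for w, v in zip(weights, records[rid]))

    top: List[Tuple[float, int]] = []  # min-heap holding the current best k
    seen = set()
    for depth in range(len(records)):
        # "Sorted access": read one more entry from every list.
        for a in range(m):
            _, rid = sorted_lists[a][depth]
            if rid not in seen:
                seen.add(rid)
                item = (score(rid), rid)
                if len(top) < k:
                    heapq.heappush(top, item)
                elif item > top[0]:
                    heapq.heapreplace(top, item)
        # No unseen record can score above the aggregate of the values
        # at the current depth, so this is an upper bound for the rest.
        threshold = sum(
            w * sorted_lists[a][depth][0] for a, w in enumerate(weights)
        )
        if len(top) == k and top[0][0] >= threshold:
            break
    return sorted(top, reverse=True)


# Hypothetical data: record id -> (attribute 0, attribute 1).
sample = {1: (0.9, 0.1), 2: (0.8, 0.8), 3: (0.2, 0.95), 4: (0.1, 0.2)}
result = threshold_topk(sample, weights=(0.5, 0.5), k=2)
print([rid for _, rid in result])
```

On this sample the loop stops at depth 2 of 4, which illustrates why estimating the stopping depth in advance, as the abstract proposes, would allow the needed list prefixes to be read in sequential or batched I/Os.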

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Publication

VLDB Journal

Volume

19

Issue

3

First Page

437

Last Page

456

ISSN

1066-8888

Identifier

10.1007/s00778-009-0174-x

Publisher

Springer Verlag

Citation

PANG, Hwee Hwa; DING, Xuhua; and ZHENG, Baihua. Efficient Processing of Exact Top-k Queries over Disk-Resident Sorted Lists. (2010). VLDB Journal. 19, (3), 437-456.
Available at: https://ink.library.smu.edu.sg/sis_research/800

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

http://dx.doi.org/10.1007/s00778-009-0174-x

Included in

Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
