Multiple continuous top-K queries over data stream
Publication Type
Conference Proceeding Article
Publication Date
5-2024
Abstract
Continuous top-kk query over sliding window is a fundamental challenge in the domain of streaming data management. Specifically, a continuous top-k query qq monitors the window WW, returning the kk objects with the highest scores to the system with each slide of the window. This paper delves into one of its important variants, referred to as multiple continuous top. kk queries over data stream, which holds significant applications. While various efforts have been made to support continuous top-k query, few have addressed the complexities of multiple continuous top-k queries. The prevailing approach involves selecting a minimal number of objects in the window as candidates, incrementally maintaining them, and using them to support query processing as efficiently as possible. However, these endeavors exhibit sensitivity to the query workload scale or query parameters such as kk, the window length nn, and others. Consequently, they incur high running/space cost in updating the candidate set. In this paper, we propose a novel index PH-Tree (Partition and Heap-based Binary Tree), designed to facilitate multiple continuous top-k queries. We partition the query window into a group of disjoint partitions and use PH-Tree to organize these partitions. Additionally, the PH-Tree allows for flexible candidate selection based on the size of each partition, parameter distribution of queries and score distribution of objects. We further develop a group of efficient algorithms to support candidate set incremental maintenance and query processing. The effectiveness and efficiency of the proposed algorithms are validated through extensive theoretical analysis and exneriments detailed in this paper.
Keywords
Sensitivity, Costs, Query processing, Binary trees, Data engineering, Partitioning algorithms, Maintenance
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 2024 IEEE 40th International Conference on Data Engineering (ICDE), Utrecht, Netherlands, May 13-16
First Page
1575
Last Page
1588
ISBN
9798350317169
Identifier
10.1109/ICDE60146.2024.00129
Publisher
IEEE
City or Country
Los Alamitos, CA
Citation
ZHU, Rui; JIA, Yujin; YANG, Xiaochun; ZHENG, Baihua; WANG, Bin; and ZONG, Chuanyu.
Multiple continuous top-K queries over data stream. (2024). Proceedings of the 2024 IEEE 40th International Conference on Data Engineering (ICDE), Utrecht, Netherlands, May 13-16. 1575-1588.
Available at: https://ink.library.smu.edu.sg/sis_research/9284
Additional URL
https://doi.org/10.1109/ICDE60146.2024.00129