The top-k dominating (TKD) query returns the k objects that dominate the maximum number of objects in a given dataset. It combines the advantages of skyline and top-k queries, and plays an important role in many decision support applications. Incomplete data exists in a wide spectrum of real datasets, due to device failure, privacy preservation, data loss, and so on. In this paper, for the first time, we carry out a systematic study of TKD queries on incomplete data, which involves the data having some missing dimensional value(s). We formalize this problem, and propose a suite of efficient algorithms for answering TKD queries over incomplete data. Our methods employ some novel techniques, such as upper bound score pruning, bitmap pruning, and partial score pruning, to boost query efficiency. Extensive experimental evaluation using both real and synthetic datasets demonstrates the effectiveness of our developed pruning heuristics and the performance of our presented algorithms.
Top-k dominating query, Incomplete data, Query processing, Dominance relationship, Algorithm
Computer Sciences | Databases and Information Systems | Numerical Analysis and Scientific Computing
Data Management and Analytics
IEEE Transactions on Knowledge and Data Engineering
Institute of Electrical and Electronics Engineers (IEEE)
MIAO, Xiaoye; GAO, Yunjun; ZHENG, Baihua; CHEN, Gang; and CUI, Huiyong.
Top-k Dominating Queries on Incomplete Data. (2016). IEEE Transactions on Knowledge and Data Engineering. 28, (1), 252-266. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/2894
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.