Indexing Metric Uncertain Data for Range Queries
Publication Type
Conference Proceeding Article
Publication Date
6-2015
Abstract
Range queries in metric spaces have applications in many areas such as multimedia retrieval, computational biology, and location-based services, where metric uncertain data exists in different forms, resulting from equipment limitations, high-throughput sequencing technologies, privacy preservation, or others. In this paper, we represent metric uncertain data by using an object-level model and a bi-level model, respectively. Two novel indexes, the uncertain pivot B+-tree (UPB-tree) and the uncertain pivot B+-forest (UPB-forest), are proposed accordingly in order to support probabilistic range queries w.r.t. a wide range of uncertain data types and similarity metrics. Both index structures use a small set of effective pivots chosen based on a newly defined criterion, and employ the B+-tree(s) as the underlying index. By design, they are easy to be integrated into any existing DBMS. In addition, we present efficient metric probabilistic range query algorithms, which utilize the validation and pruning techniques based on our derived probability lower and upper bounds. Extensive experiments with both real and synthetic data sets demonstrate that, compared against existing state-of-the-art indexes for metric uncertain data, the UPB-tree and UPB-forest incur much lower construction costs, consume smaller storage spaces, and can support more efficient metric probabilistic range queries.
Keywords
metric space, index structure, range query, uncertain data
Discipline
Computer Sciences | Databases and Information Systems | Numerical Analysis and Scientific Computing
Publication
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, May 31-June 4, Melbourne
First Page
951
Last Page
965
ISBN
9781450327589
Identifier
10.1145/2723372.2723728
Publisher
ACM
City or Country
New York
Citation
CHEN, Lu; GAO, Yunjun; LI, Xinhan; JENSEN, Christian S.; CHEN, Gang; and ZHENG, Baihua.
Indexing Metric Uncertain Data for Range Queries. (2015). SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, May 31-June 4, Melbourne. 951-965.
Available at: https://ink.library.smu.edu.sg/sis_research/2892
Additional URL
http://dx.doi.org/10.1145/2723372.2723728