Publication Type
Journal Article
Version
publishedVersion
Publication Date
8-2017
Abstract
Range queries and range joins in metric spaces have applications in many areas, including GIS, computational biology, and data integration, where metric uncertain data exist in different forms, resulting from circumstances such as equipment limitations, high-throughput sequencing technologies, and privacy preservation. We represent metric uncertain data by using an object-level model and a bi-level model, respectively. Two novel indexes, the uncertain pivot B+-tree (UPB-tree) and the uncertain pivot B+-forest (UPB-forest), are proposed in order to support probabilistic range queries and range joins for a wide range of uncertain data types and similarity metrics. Both index structures use a small set of effective pivots chosen based on a newly defined criterion and employ the B+-tree(s) as the underlying index. In addition, we present efficient metric probabilistic range query and metric probabilistic range join algorithms, which utilize validation and pruning techniques based on derived probability lower and upper bounds. Extensive experiments with both real and synthetic data sets demonstrate that, compared against existing state-of-the-art indexes for metric uncertain data, the UPB-tree and the UPB-forest incur much lower construction costs, consume less storage space, and can support more efficient metric probabilistic range queries and metric probabilistic range joins.
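The validation and pruning idea mentioned in the abstract can be illustrated in a much simpler setting. The sketch below is not the UPB-tree or the paper's algorithm: it assumes certain (non-probabilistic) objects, a flat table of precomputed object-to-pivot distances instead of a B+-tree, and Euclidean distance as the metric. The names pivot_table and range_query are hypothetical. It only shows how the triangle inequality yields a lower bound that prunes an object and an upper bound that validates it, so that some distance computations are skipped.

```python
from math import dist  # Euclidean distance; stands in for any metric (assumption)

def pivot_table(objects, pivots, metric=dist):
    # Precompute the distance from every object to every pivot.
    return [[metric(o, p) for p in pivots] for o in objects]

def range_query(q, r, objects, pivots, table, metric=dist):
    # Return the objects within distance r of q, plus the number of
    # object-to-query distances that actually had to be computed.
    q_to_p = [metric(q, p) for p in pivots]
    result, computed = [], 0
    for o, o_to_p in zip(objects, table):
        # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o) <= d(q,p) + d(o,p)
        lower = max(abs(a - b) for a, b in zip(q_to_p, o_to_p))
        upper = min(a + b for a, b in zip(q_to_p, o_to_p))
        if lower > r:
            continue              # pruned: o cannot be within r of q
        if upper <= r:
            result.append(o)      # validated: o is certainly within r of q
            continue
        computed += 1             # bounds are inconclusive, verify directly
        if metric(q, o) <= r:
            result.append(o)
    return result, computed

# Illustrative data (made up for this sketch).
objects = [(0, 0), (1, 0), (5, 5), (10, 10), (0.5, 0.5)]
pivots = [(0, 0), (10, 10)]
table = pivot_table(objects, pivots)
result, computed = range_query((1, 1), 1.5, objects, pivots, table)
```

In the probabilistic setting described in the abstract, the analogous bounds are placed on the probability that an uncertain object qualifies, rather than on a single distance.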
Keywords
Range query, Range join, Uncertain data, Metric space, Index structure
Discipline
Databases and Information Systems | Data Storage Systems
Research Areas
Data Science and Engineering
Publication
VLDB Journal
Volume
26
Issue
4
First Page
585
Last Page
610
ISSN
1066-8888
Identifier
10.1007/s00778-017-0465-6
Publisher
Springer Verlag (Germany)
Citation
CHEN, Lu; GAO, Yunjun; ZHONG, Aoxiao; JENSEN, Christian S.; CHEN, Gang; and ZHENG, Baihua.
Indexing metric uncertain data for range queries and range joins. (2017). VLDB Journal. 26, (4), 585-610.
Available at: https://ink.library.smu.edu.sg/sis_research/3707
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1007/s00778-017-0465-6