Publication Type
Journal Article
Version
publishedVersion
Publication Date
1-2025
Abstract
Similarity search finds objects that are similar to a given query object based on a similarity metric. As the amount and variety of data continue to grow, similarity search in metric spaces has gained significant attention. Metric spaces can accommodate any type of data and support flexible distance metrics, making similarity search in metric spaces beneficial for many real-world applications, such as multimedia retrieval, personalized recommendation, trajectory analytics, data mining, decision planning, and distributed servers. However, existing studies mostly focus on indexing metric spaces on a single machine, which faces efficiency and scalability limitations with increasing data volume and query amount. Recent advancements in similarity search turn towards distributed methods, while they face challenges including inefficient local data management, unbalanced workload, and low concurrent search efficiency. To this end, we propose DIMS, an efficient Distributed Index for similarity search in Metric Spaces. First, we design a novel three-stage heterogeneous partition to achieve workload balance. Then, we present an effective three-stage indexing structure to efficiently manage objects. We also develop concurrent search methods with filtering and validation techniques that support efficient distributed similarity search. Additionally, we devise a cost-based optimization model to balance communication and computation cost. Extensive experiments demonstrate that DIMS significantly outperforms existing distributed similarity search approaches.
Keywords
Similarity Search, Metric Space, Distributed Index, Homogeneous and Heterogeneous Partition
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Data Science and Engineering
Areas of Excellence
Digital transformation
Publication
IEEE Transactions on Knowledge and Data Engineering
Volume
37
Issue
1
First Page
210
Last Page
225
ISSN
1041-4347
Identifier
10.1109/TKDE.2024.3487759
Publisher
Institute of Electrical and Electronics Engineers
Citation
ZHU, Yifan; LUO, Chengyang; QIAN, Tang; CHEN, Lu; GAO, Yunjun; and ZHENG, Baihua.
DIMS: Distributed index for similarity search in metric spaces. (2025). IEEE Transactions on Knowledge and Data Engineering. 37, (1), 210-225.
Available at: https://ink.library.smu.edu.sg/sis_research/10154
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TKDE.2024.3487759
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons