Publication Type

Journal Article

Version

publishedVersion

Publication Date

1-2025

Abstract

Similarity search finds objects that are similar to a given query object based on a similarity metric. As the amount and variety of data continue to grow, similarity search in metric spaces has gained significant attention. Metric spaces can accommodate any type of data and support flexible distance metrics, making similarity search in metric spaces beneficial for many real-world applications, such as multimedia retrieval, personalized recommendation, trajectory analytics, data mining, decision planning, and distributed servers. However, existing studies mostly focus on indexing metric spaces on a single machine, which faces efficiency and scalability limitations as data volume and query load increase. Recent advances in similarity search have turned toward distributed methods, but these face challenges including inefficient local data management, unbalanced workload, and low concurrent search efficiency. To this end, we propose DIMS, an efficient Distributed Index for similarity search in Metric Spaces. First, we design a novel three-stage heterogeneous partition to achieve workload balance. Then, we present an effective three-stage indexing structure to efficiently manage objects. We also develop concurrent search methods with filtering and validation techniques that support efficient distributed similarity search. Additionally, we devise a cost-based optimization model to balance communication and computation cost. Extensive experiments demonstrate that DIMS significantly outperforms existing distributed similarity search approaches.

Keywords

Similarity Search, Metric Space, Distributed Index, Homogeneous and Heterogeneous Partition

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Areas of Excellence

Digital transformation

Publication

IEEE Transactions on Knowledge and Data Engineering

Volume

37

Issue

1

First Page

210

Last Page

225

ISSN

1041-4347

Identifier

10.1109/TKDE.2024.3487759

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TKDE.2024.3487759
