Publication Type

Journal Article

Publication Date

6-2005

Abstract

Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations q1, ..., qn, an ANN query outputs the facility p ∈ P that minimizes the sum of distances |pqi| for 1 ≤ i ≤ n that the users have to travel in order to meet there. Similarly, another ANN query may report the point p ∈ P that minimizes the maximum distance that any user has to travel, or the minimum distance from some user to his/her closest facility. If Q fits in memory and P is indexed by an R-tree, we develop algorithms for aggregate nearest neighbors that capture several versions of the problem, including weighted queries and incremental reporting of results. Then, we analyze their performance and propose cost models for query optimization. Finally, we extend our techniques for disk-resident queries and approximate ANN retrieval. The efficiency of the algorithms and the accuracy of the cost models are evaluated through extensive experiments with real and synthetic datasets.
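
As an illustration of the query definition in the abstract, the following is a minimal brute-force sketch in Python. It is not one of the R-tree-based algorithms the article develops; it scans every facility and is meant only to make the sum, max, and min aggregates (and the weighted variant) concrete. The function names `aggregate_distance` and `brute_force_ann`, the Euclidean metric, and the way weights multiply each distance are assumptions made for this example.

```python
import math


def aggregate_distance(p, queries, agg="sum", weights=None):
    """Aggregate distance from point p to all query points.

    agg is "sum", "max", or "min". weights (an assumption of this sketch)
    scale each individual distance; they default to 1.0 per query point.
    """
    if weights is None:
        weights = [1.0] * len(queries)
    # Euclidean distance is assumed here for illustration.
    dists = [w * math.dist(p, q) for w, q in zip(weights, queries)]
    if agg == "sum":
        return sum(dists)
    if agg == "max":
        return max(dists)
    if agg == "min":
        return min(dists)
    raise ValueError(f"unknown aggregate: {agg!r}")


def brute_force_ann(facilities, queries, agg="sum", k=1, weights=None):
    """Return the k facilities with the smallest aggregate distance.

    This is a linear scan over P, with no index and no pruning.
    """
    ranked = sorted(
        facilities,
        key=lambda p: aggregate_distance(p, queries, agg, weights),
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical toy data: three facilities (P) and three users (Q).
    P = [(0.5, 0.0), (5.0, 0.0), (9.0, 3.0)]
    Q = [(0.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
    for agg in ("sum", "max", "min"):
        print(agg, brute_force_ann(P, Q, agg=agg))
```

Running the script prints the best facility under each aggregate for the toy data. The scan costs one distance computation per pair of facility and query point, which is the baseline that an indexed method would aim to improve on.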

Keywords

Aggregation, Nearest neighbor queries, Spatial database, Weighted queries

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing | Theory and Algorithms

Research Areas

Data Management and Analytics

Publication

ACM Transactions on Database Systems

Volume

30

Issue

2

First Page

529

Last Page

576

ISSN

0362-5915

Identifier

10.1145/1071610.1071616

Publisher

ACM

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

http://dx.doi.org/10.1145/1071610.1071616