Publication Type
Journal Article
Version
acceptedVersion
Publication Date
12-2016
Abstract
Given a set S of multidimensional objects and a query object q, a k nearest neighbor (kNN) query finds from S the k objects closest to q. This query is a fundamental problem in database, data mining, and information retrieval research. It plays an important role in a wide spectrum of real applications such as image recognition and location-based services. However, due to the failure of data transmission devices, improper storage, and accidental loss, incomplete data exist widely in those applications, where some dimensional values of data items are missing. In this paper, we systematically study incomplete k nearest neighbor (IkNN) search, which aims at the kNN query for incomplete data. We formalize this problem and propose an efficient lattice partition algorithm using our newly developed LαB index to support exact IkNN retrieval, with the help of two pruning heuristics, i.e., α value pruning and partial distance pruning. Furthermore, we propose an approximate algorithm, namely histogram approximate, to support approximate IkNN search with improved search efficiency and a guaranteed error bound. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of the newly designed indexes and pruning heuristics, as well as the performance of our presented algorithms under a variety of experimental settings.
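To make the problem statement concrete, the following is a minimal brute-force sketch of a kNN query over incomplete data. It is not the paper's lattice partition algorithm and does not use the LαB index or either pruning heuristic. The distance function is an assumption made for illustration: an unscaled Euclidean distance over only those dimensions observed in both the object and the query, with missing values represented as `None`. The names `partial_distance` and `brute_force_iknn` are hypothetical.

```python
import heapq
import math
from typing import List, Optional, Sequence

# A point is a sequence of coordinates; None marks a missing dimensional value.
Point = Sequence[Optional[float]]


def partial_distance(o: Point, q: Point) -> float:
    """Euclidean distance over dimensions observed in both o and q.

    Assumption for illustration only: dimensions missing in either point
    are skipped, and no rescaling is applied. If the two points share no
    observed dimension, the distance is treated as infinite.
    """
    shared = [(x, y) for x, y in zip(o, q) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))


def brute_force_iknn(S: Sequence[Point], q: Point, k: int) -> List[Point]:
    """Return the k objects of S closest to q by scanning every object."""
    return heapq.nsmallest(k, S, key=lambda o: partial_distance(o, q))


if __name__ == "__main__":
    S = [
        (1.0, None, 2.0),
        (None, 5.0, 5.0),
        (0.0, 0.0, None),
        (9.0, 9.0, 9.0),
    ]
    q = (0.0, 0.0, 0.0)
    print(brute_force_iknn(S, q, k=2))
```

A linear scan like this computes a distance for every object in S, which is the cost that an index and pruning heuristics are meant to avoid.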
Keywords
k Nearest Neighbor Search, Incomplete Data, Query Processing
Discipline
Computer Sciences | Theory and Algorithms
Publication
IEEE Transactions on Fuzzy Systems
Volume
24
Issue
6
First Page
1349
Last Page
1363
ISSN
1063-6706
Identifier
10.1109/TFUZZ.2016.2516562
Publisher
IEEE
Citation
MIAO, Xiaoye; GAO, Yunjun; CHEN, Gang; ZHENG, Baihua; and CUI, Huiyong. Processing Incomplete k Nearest Neighbor Search. (2016). IEEE Transactions on Fuzzy Systems. 24, (6), 1349-1363.
Available at: https://ink.library.smu.edu.sg/sis_research/3322
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
http://doi.org/10.1109/TFUZZ.2016.2516562