Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

4-2021

Abstract

The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy a large amount of GPU memory to maximize insertion efficiency. In many cases, however, the data stored in hash tables are updated dynamically, and existing approaches consume unnecessarily large memory resources. One naïve solution is to rebuild the hash table (known as rehashing) whenever it is either filled or mostly empty, but this approach incurs significant rehashing overhead. In this paper, we propose a novel dynamic cuckoo hash table technique on GPUs, known as DyCuckoo. We devise a resizing strategy for dynamic scenarios that avoids rehashing the entire table and ensures a guaranteed filled factor. The strategy trades search performance for resizing efficiency, and this tradeoff can be configured by users. To further improve efficiency, we propose a 2-in-d cuckoo hashing scheme that ensures a maximum of two lookups for find and delete operations, while retaining insertion performance similar to that of general cuckoo hashing. Extensive experiments have validated the proposed design's effectiveness over several state-of-the-art hash table implementations on GPUs. DyCuckoo achieves superior efficiency while enabling fine-grained memory control, which is not available in existing GPU hash table approaches.
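
The abstract gives no implementation detail, so the following is only a toy, single-threaded Python sketch of the two ideas it names, not the authors' GPU implementation. It assumes d subtables with one entry per slot, a first hash that picks which two of the d subtables a key may occupy, and a resize step that doubles a single subtable and relocates only that subtable's entries. The class name `TwoInDCuckoo`, the hash choices, and the "grow the smallest subtable" policy are all hypothetical choices made for illustration.

```python
from itertools import combinations


class TwoInDCuckoo:
    """Toy 2-in-d cuckoo table: each key may live in only two of d subtables."""

    def __init__(self, d=4, slots_per_table=8, max_evictions=32):
        self.tables = [[None] * slots_per_table for _ in range(d)]
        self.pairs = list(combinations(range(d), 2))  # candidate subtable pairs
        self.max_evictions = max_evictions

    def _candidates(self, key):
        # Assumed hashing: one hash selects the pair of subtables, and a
        # per-subtable hash selects the slot. Always returns two positions.
        a, b = self.pairs[hash((-1, key)) % len(self.pairs)]
        return [(t, hash((t, key)) % len(self.tables[t])) for t in (a, b)]

    def find(self, key):
        # At most two probes, independent of d.
        for t, s in self._candidates(key):
            entry = self.tables[t][s]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def delete(self, key):
        # Same two probes as find.
        for t, s in self._candidates(key):
            entry = self.tables[t][s]
            if entry is not None and entry[0] == key:
                self.tables[t][s] = None
                return True
        return False

    def _grow(self, t):
        # Double one subtable and re-place only its own entries. Because the
        # slot is hash % size, an entry at slot s moves to s or s + old size.
        old = self.tables[t]
        self.tables[t] = [None] * (2 * len(old))
        for entry in old:
            if entry is not None:
                self.tables[t][hash((t, entry[0])) % (2 * len(old))] = entry

    def insert(self, key, value):
        for t, s in self._candidates(key):  # overwrite if already present
            entry = self.tables[t][s]
            if entry is not None and entry[0] == key:
                self.tables[t][s] = (key, value)
                return
        item, avoid = (key, value), None
        while True:
            for _ in range(self.max_evictions):
                cands = self._candidates(item[0])
                for t, s in cands:
                    if self.tables[t][s] is None:
                        self.tables[t][s] = item
                        return
                # Both slots taken: evict an occupant, but not from the slot
                # the item in hand was just evicted from.
                t, s = next(c for c in cands if c != avoid)
                item, self.tables[t][s] = self.tables[t][s], item
                avoid = (t, s)
            # Eviction chain too long: grow the smallest subtable and retry.
            self._grow(min(range(len(self.tables)),
                           key=lambda i: len(self.tables[i])))
            avoid = None
```

In this sketch, `find` and `delete` never inspect more than two slots no matter how large d is, and a resize touches one subtable while the others stay in place, so subtable sizes can differ after growth.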

Discipline

Databases and Information Systems | Data Storage Systems

Research Areas

Data Science and Engineering

Publication

Proceedings of 2021 IEEE 37th International Conference on Data Engineering (ICDE 2021), Chania, Greece, April 19-22

First Page

744

Last Page

755

ISBN

9781728191843

Identifier

10.1109/ICDE51399.2021.00070

Publisher

IEEE

City or Country

USA
