Turkish Journal of Electrical Engineering and Computer Sciences

High-speed data deduplication using parallelized cuckoo hashing

Abstract

Data deduplication is a capacity optimization technology used in backup systems to identify and store only nonredundant data blocks. The CPU-intensive tasks involved in a hash-based deduplication system remain a challenge in improving system performance. In this paper, we propose a parallel variant of standard cuckoo hashing. The CPU-intensive fingerprint insertion and lookup operations are performed in parallel and distributed among the nodes of the deduplication cluster. Furthermore, the uniform handling of blocks by the cluster nodes involved in duplicate identification provides good load balance. Experimental evaluations using real-world backup and Linux kernel data sets reveal that the proposed deduplication system achieves up to 100% higher backup speed, up to 28% lower lookup latency, and up to 24% shorter backup time than other deduplication systems.
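For readers unfamiliar with the baseline technique, the following is a minimal sketch of standard (sequential) cuckoo hashing used as a fingerprint index for deduplication. It is not the parallel variant proposed in the paper, whose details the abstract does not give. The class name `CuckooFingerprintIndex`, the function `deduplicate`, the choice of SHA-1 fingerprints, the derivation of slots from fingerprint bytes, and the overflow stash are all assumptions made for illustration.

```python
import hashlib
from typing import Iterable, List, Optional, Set


class CuckooFingerprintIndex:
    """Sequential two-table cuckoo hash set for block fingerprints (illustrative)."""

    def __init__(self, capacity: int = 1024, max_kicks: int = 64) -> None:
        self.capacity = capacity
        self.max_kicks = max_kicks
        # Two tables; each fingerprint has exactly one candidate slot per table.
        self.tables: List[List[Optional[bytes]]] = [
            [None] * capacity,
            [None] * capacity,
        ]
        # Assumption: fingerprints that cannot be placed go to a small stash
        # instead of triggering a rehash.
        self.stash: Set[bytes] = set()

    def _slot(self, fp: bytes, which: int) -> int:
        # Assumption: derive the two slot indices from disjoint fingerprint bytes.
        part = fp[:8] if which == 0 else fp[8:16]
        return int.from_bytes(part, "big") % self.capacity

    def contains(self, fp: bytes) -> bool:
        # Lookup inspects at most two slots, plus the stash.
        if any(self.tables[t][self._slot(fp, t)] == fp for t in (0, 1)):
            return True
        return fp in self.stash

    def insert(self, fp: bytes) -> None:
        if self.contains(fp):
            return
        current, t = fp, 0
        for _ in range(self.max_kicks):
            i = self._slot(current, t)
            # Place the current fingerprint and pick up whatever occupied the slot.
            current, self.tables[t][i] = self.tables[t][i], current
            if current is None:
                return
            t = 1 - t  # the evicted fingerprint tries its slot in the other table
        self.stash.add(current)  # eviction chain too long


def deduplicate(blocks: Iterable[bytes], index: CuckooFingerprintIndex) -> List[bytes]:
    """Return only the blocks whose fingerprint has not been seen before."""
    unique = []
    for block in blocks:
        fp = hashlib.sha1(block).digest()
        if not index.contains(fp):
            index.insert(fp)
            unique.append(block)
    return unique
```

Calling `deduplicate` on a stream of blocks with a fresh `CuckooFingerprintIndex` keeps the first occurrence of each distinct block and drops later repeats. Because each lookup touches a fixed number of slots, the insertion and lookup steps are natural candidates for the kind of parallel, cluster-distributed execution the abstract describes.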

DOI

10.3906/elk-1708-336

Keywords

Deduplication, parallelized cuckoo, backup

First Page

1417

Last Page

1429

Recommended Citation

JEYARAJ, J. R, KAMBARAJ, S, & DHARMARAJAN, V (2018). High-speed data deduplication using parallelized cuckoo hashing. Turkish Journal of Electrical Engineering and Computer Sciences 26 (3): 1417-1429. https://doi.org/10.3906/elk-1708-336

Included in

Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons
