Int'l Conf. Software Eng. Research and Practice | SERP'17 | 143

Improving Cuckoo Hashing with Perfect Hashing

Moulika Chadalavada and Yijie Han
School of Computing and Engineering
University of Missouri at Kansas City
5100 Rockhill Road
Kansas City, MO 64110, USA.
[email protected], [email protected]

Abstract - This paper aims at improving Cuckoo Hashing by using Perfect Hashing to store keys in memory based on their frequency. Perfect Hashing is fast and has a high hit ratio. Cuckoo Hashing has high memory usage when allocating keys to its memory. Combining Cuckoo Hashing and Perfect Hashing will therefore increase the key hit ratio.

Keywords: Tree, Hashing, Cuckoo Hashing, Perfect Hashing, Algorithms

1 Introduction

There are many hashing techniques that aim at storing keys in memory so as to make key access efficient. In network applications, packet classification plays a very prominent role [2], and one option to increase throughput is to use algorithms based on hashing [3]. A Hash Table, or Hash Map, is a data structure that maps keys to values. A hash table uses a Hash Function to compute an index into an array, from which the required value is found. The main disadvantage of Hash Tables is that multiple keys can map to the same index, which results in a collision. Many hashing techniques have been introduced to handle collisions when allocating memory.

Cuckoo Hashing is a hash table scheme that provides high memory utilization and constant access time [4]. It mainly aims at reducing collisions and optimizing throughput. There are several implementations of Cuckoo Hashing, such as the Serial Implementation, Parallel Implementation, Parallel Pipeline Implementation, and Parallel d-Pipeline Implementation [1].

A Perfect Hash Function is a hash function that maps the distinct elements of a set S to a set of integers with no collisions. However, the set of keys to be hashed must be provided in advance to create the hash function. In mathematical terms, it is a total injective function. Such a function can be used to implement a lookup table with constant worst-case access time. There are many hash functions that are similar to Perfect Hashing, but its main advantage is that no collision resolution needs to be implemented. In this paper, we use Perfect Hashing to improve Cuckoo Hashing in terms of memory utilization and of allocating memory based on the frequency of keys.

2 What is Cuckoo Hashing

Cuckoo hashing uses a number d of hash tables, and an element x can be placed in tables 1, 2, ..., d in positions h_1(x), h_2(x), ..., h_d(x), where the h_i are hash functions [1]. The main difference between d-left hashing [5] and Cuckoo hashing is that in d-left hashing, when all candidate positions are occupied, the new element cannot be inserted, which limits memory usage. In Cuckoo Hashing, the elements in the occupied positions are moved to their alternative positions to make room for the new element. There are many implementations of Cuckoo Hashing that aim at increasing throughput [1]. In the Serial implementation the tables are accessed serially, and in the Parallel implementation the table is selected at random.

In the Pipeline architecture [1], a search accesses each memory sequentially: when the current operation moves on to memory-2, the next search operation can start accessing memory-1. In this pipeline implementation, once a search is successful the remaining memories need not be accessed. In the Parallel d-pipeline [1], each pipeline has a different entry point, which allows an element to be inserted into any table that is idle in that cycle. If an element is in the first pipeline and a match is found in Table-1, then in the next cycle an element will be inserted into the second pipeline to make use of Table-2.

3 What is Perfect Hashing

A Perfect Hash function for a set S is a hash function that maps the distinct elements of S to a set of integers with no collisions. Minimal Perfect Hashing [9] guarantees that n keys map to 0..n-1 with no collisions at all. Given a set of n keys, a static hash table of size m = O(n) can be constructed such that Search takes O(1) time in the worst case. A perfect hash function with a limited range of values can be used for efficient lookup operations by placing the keys from S in a lookup table indexed by the function's output. One can then test whether a key is present in S by looking at its cell of the table, and each lookup takes constant time in the worst case.
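The lookup-table use of a perfect hash function can be sketched as follows. This is only a minimal illustration under stated assumptions, not the construction of [9]: it searches by brute force over seeds of a SHA-256-based hash until no two keys of S share a cell, and the names seeded_hash, build_perfect_table and contains, as well as the sample keys, are hypothetical.

```python
import hashlib


def seeded_hash(key: str, seed: int, table_size: int) -> int:
    """Deterministically hash key into range(table_size), varied by seed."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def build_perfect_table(keys, table_size, max_seeds=100_000):
    """Find a seed under which every key of the static set gets its own cell.

    Brute-force seed search is an assumption made for illustration; it is
    only practical for small key sets.
    """
    keys = list(keys)
    if table_size < 1:
        raise ValueError("table_size must be positive")
    if len(set(keys)) != len(keys):
        raise ValueError("keys must be distinct")
    if len(keys) > table_size:
        raise ValueError("table_size must be at least the number of keys")
    for seed in range(max_seeds):
        table = [None] * table_size
        for key in keys:
            cell = seeded_hash(key, seed, table_size)
            if table[cell] is not None:
                break  # collision under this seed, try the next one
            table[cell] = key
        else:
            return seed, table  # every key landed in its own cell
    raise RuntimeError("no collision-free seed found")


def contains(key: str, seed: int, table) -> bool:
    """Membership test: inspect exactly one cell, so the cost is constant."""
    return table[seeded_hash(key, seed, len(table))] == key


# Hypothetical static key set S and a table of size m = 2n.
frequent_keys = ["in", "no", "single", "decade", "one"]
seed, table = build_perfect_table(frequent_keys, table_size=2 * len(frequent_keys))
```

Storing the key itself in the cell is what lets contains reject keys outside S: a key that is not in S still hashes to some cell, but the comparison against the stored key fails.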


As discussed above, Perfect Hashing is a technique for building a hash table with no collisions. This is possible when all the keys are known in advance. Minimal perfect hashing means that the resulting hash table contains exactly one entry for each known key and no empty slots exist. To insert keys into slots, two levels of hash functions are used [9]. The first, H(key), hashes the key to a position in an intermediate array G. The second, F(d, key), uses the extra information d taken from G to find the unique position of the key. This scheme always returns a value. The value is meaningful if we know for sure that the key we are searching for is in the table; otherwise, it returns bad information.

How can the intermediate value be found in Perfect Hashing? [9]

1. We place the keys into buckets according to the first hash function, H(key).

2. We then process the buckets largest first and try to place all the keys a bucket contains into empty slots. If that is unsuccessful, we keep trying with successively larger values of d. It sounds like this would take a long time, but it does not. Since we look for the d value of the buckets with the most items early, they are likely to find empty spots. When we get to buckets with just one item, we can simply place each item into the next unoccupied spot. [5]

4 Use of Perfect Hashing to improve Cuckoo Hashing

4.1 Allocating Key to Memory

We show how to use perfect hashing to improve Cuckoo hashing by considering the frequency of keys. We cannot anticipate all possible keys because the set of keys is huge. For example, if keys are limited to no more than 20 letters, then the set of keys has size on the order of 127^20. However, we can put the known frequently encountered keys into a set S and then map the keys in S to memory modules by using a perfect hashing function f. Such a perfect hashing function can be obtained in O(|S|^2 b) time [7], where b is the number of bits needed to represent a key in S. After f is obtained, f(x) for a key x can be computed in constant time [7]. In Cuckoo hashing every key is assumed to have the same priority. Here we analyze the set S of frequently encountered keys and store high-frequency keys together in a memory module. Because there are few keys with high frequency and more keys with lower frequency, we may, say, store keys with frequency above 50% in memory module 0, keys with frequency 20% to 50% in memory module 1, keys with frequency 5% to 20% in memory module 2, and keys with frequency less than 5% in memory module 3.

The architecture of our scheme is as follows. For an input key x, first compute its perfect hash value f(x). According to [7], the value of f(x) is within {0, 1, ..., |S|^2 - 1}. Thus, no matter whether x is in S or not, f(x) will always return a value in {0, 1, ..., |S|^2 - 1}. We then use the value of f(x) to index into a table T that stores the memory module number for each value of f(x). Thus, if x is in S, we find the correct memory module that stores x. Say x is in S and the memory module that stores x is M_a. Then T[f(x)] = a. Once we know the memory module M_a, we use a hash function h for M_a to find the location h(x) of x in M_a.

If x is not in S, we still first compute f(x) and use T[f(x)] = a to find a memory module M_a. We then use h(x) to locate x in M_a. Three situations can happen here. The first situation is that h(x) = h(y) for some y in S, so h(x) and h(y) collide. We then know that x is a less frequent key. We can go to the memory module M_b for storing less frequent keys, and hash and rehash x there to determine whether x is already in M_b or needs to be inserted into M_b. The second situation is that no key is at position h(x), or x itself is stored at position h(x) of M_a. This can happen because f(x) = f(y) for some y in S, and thus T[f(x)] = T[f(y)] = a, so we go to the same memory module M_a for both of them, while h(x) ≠ h(y). If position h(x) is vacant, we store x at position h(x) of M_a. If x is already at position h(x) of M_a, then we have found x in M_a. The third situation is that h(x) = h(y) for some y not in S, where y already occupies position h(y) in M_a. In this situation x is again a less frequent key, and we need to go to the memory for less frequent keys to locate x.

Also note that f(x) has |S|^2 possible values, of which only |S| correspond to keys in S; the other |S|^2 - |S| values do not correspond to any key in S. These |S|^2 - |S| values of f(x) therefore correspond to less frequent keys, and we can set T[f(x)] for them to the memory that stores less frequent keys. In this way the frequently occurring keys in S are identified in constant time. For keys not in S, the hash value may collide with the keys in S. Since these keys occur less frequently, we can afford more hashing and rehashing time for them. For keys in S, in contrast, the perfect hashing is fast and the hit ratio is high. In perfect hashing, all the keys in the set S are known. Initially, the hash f needs to be computed for each key, which yields the frequency of the key and the memory module for it. Each key is assigned to a memory module via the hash table value for it. Based on the frequency of each key, the keys are stored in memory modules. The keys with the highest frequency are stored in Memory-1 and the lower-frequency keys are stored in the next memory. If there are any non-frequent words, they can be stored in a separate memory.

For example, let us take the famous sentence below, stated by Frederick P. Brooks Jr.

“There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity.”
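The word frequencies of this sentence, and the resulting memory assignment, can be checked with a short sketch. It is an illustration only: the tokenization rule, the count threshold of 2, the use of "Memory-2" as the next memory, and the names word_counts and assign_memories are assumptions made here, and a plain dictionary stands in for the index table.

```python
import re
from collections import Counter

SENTENCE = (
    "There is no single development, in either technology or "
    "management technique, which by itself promises even one "
    "order-of-magnitude improvement within a decade in "
    "productivity, in reliability, in simplicity."
)


def word_counts(text: str) -> Counter:
    """Count lowercase words; a hyphenated word stays a single token."""
    return Counter(re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower()))


def assign_memories(counts: Counter, threshold: int = 2) -> dict:
    """Map each word to (count, memory name).

    Words seen at least `threshold` times go to Memory-1, all others to
    Memory-2. The threshold value is a hypothetical choice.
    """
    return {
        word: (count, "Memory-1" if count >= threshold else "Memory-2")
        for word, count in counts.items()
    }


counts = word_counts(SENTENCE)
index_table = assign_memories(counts)
print(index_table["in"])      # frequent word
print(index_table["within"])  # counted separately from "in"
```

Matching whole tokens rather than substrings matters here: "within" contains the letters "in" but is a different key, so it must not add to the count of "in".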


In the above sentence the most frequently occurring word is ‘in’, which has a count of 4, and all other words have a count of 1. As per our problem statement, each keyword is hashed and a distinct hash value from {0, 1, 2, ..., n^2 - 1} is assigned to each word. An index table is maintained to store both the frequency and the memory location of each word. As ‘in’ is the more frequently occurring word, it is stored in Memory-1, and the other words, with lower frequency, are stored in Memory-2. Once the hashing, memory allocation and updating of the index table are completed for all words, looking up any word in memory becomes easy.

4.2 Adding new Key to Memory

All the keys in the set are known, so once the memory allocation is completed any key can be looked up in memory based on its frequency. If a new, unknown key has to be stored in memory, hashing is performed first. If the key has a high frequency, it is looked up in the memory that stores keys with high frequencies; if it has a lower frequency, it is looked up in the memory that stores keys with low frequencies. If the new key is not a frequent key, it is stored in the memory that stores non-frequent keys.

Because of this mechanism, hashing, key storage, memory utilization and key lookup are performed very efficiently. Compared to Cuckoo Hashing, this mechanism is more efficient because hashing is fast and memory is utilized according to key frequency. To explain this mechanism with an example, consider the table below, which contains a list of characters and their respective hashed frequencies.

Table 1: Perfect Hashing Indexing Table

Character   Hashed Frequency   Memory
a           0.1%               Memory-2
b           0.4%               Memory-2
c           4%                 Memory-1
d           5%                 Memory-1
e           0.3%               Memory-2

As per the above table, characters c and d have the highest frequencies, with 4% and 5% respectively, so these two characters are stored in Memory-1. Characters a, b and e have lower frequencies, with 0.1%, 0.4% and 0.3% respectively, so these three keys are stored in the next memory, i.e. Memory-2. The index table stores the hashed value of each character and its respective memory location.

5 Conclusions

In this paper, we explained how perfect hashing can be used to improve Cuckoo Hashing by taking the frequency of keys into account. We also explained how this mechanism is used to increase the key hit ratio and to reduce memory usage. Key lookup in memory based on frequency will be fast, and inserting a new key into memory also becomes easy with this mechanism.

6 References

[1] S. Pontarelli, P. Reviriego, J. A. Maestro, Parallel d-Pipeline: A Cuckoo Hashing Implementation for Increased Throughput, IEEE Transactions on Computers, vol. 65, 326-331 (2016).

[2] P. Gupta and N. McKeown, “Algorithms for packet classification,” IEEE Network, vol. 15, no. 2, pp. 24–32, 2001.

[3] A. Kirsch, M. Mitzenmacher, and G. Varghese, “Hash-based techniques for high-speed packet processing,” in Algorithms for Next Generation Networks. London, U.K.: Springer, 2010, pp. 181–218.

[4] R. Pagh and F. F. Rodler, “Cuckoo hashing,” J. Algorithms, vol. 51, no. 2, pp. 122–144, 2004.

[5] A. Broder and M. Mitzenmacher, “Using multiple hash functions to improve IP lookups,” in Proc. 20th Annu. Joint Conf. IEEE Comput. Commun. Soc., 2001, vol. 3, pp. 1454–1463.

[6] R. Raman, The Power of Collision: Randomized Parallel Algorithms for Chaining and Integer Sorting, Proceedings of the Tenth Conference on Foundations of Software Technology and Theoretical Computer Science, 9-11 (1990).

[7] R. Raman, Priority queues: small, monotone and trans-dichotomous, Proc. 1996 European Symp. on Algorithms, Lecture Notes in Computer Science 1136, 121-137 (1996).

[8] https://en.wikipedia.org/wiki/Perfect_hash_function

[9] D. Belazzougui, F. C. Botelho, M. Dietzfelbinger, Hash, Displace, and Compress. In: A. Fiat, P. Sanders (eds), Algorithms - ESA 2009, Lecture Notes in Computer Science, vol. 5757, Springer, Berlin, Heidelberg, 2009.

ISBN: 1-60132-468-5, CSREA Press ©
