L15 Maps and Hashes
Associative Array
What is a Map
A map models a searchable collection of key-value entries
The main operations of a map are for searching, inserting, and deleting items
Applications:
● address book
● student-record database
You may also see maps called: associative array, symbol table, or dictionary
Entry ADT
An entry stores a key-value pair
(k,v)
Methods:
put(k, v): if there is no entry with key k, insert entry (k, v), and otherwise set its
value to v. Return an iterator to the new/modified entry
Interface
int size() const; // number of entries
bool empty() const; // is the map empty?
Iterator find(const K& k) const; // find entry with key k
Iterator put(const K& k, const V& v); // insert or replace
Map via a List (Pseudocode)
Find
Algorithm find(k) {
  for each p in [S.begin(), S.end()) {
    if p->key() == k {
      return p
    }
  }
  return S.end() // there is no entry with key equal to k
}
Map via a List (Pseudocode)
Put
Algorithm put(k,v) {
  for each p in [S.begin(), S.end()) {
    if p->key() == k {
      p->setValue(v)
      return p
    }
  }
  p = S.insertBack((k,v)) // there is no entry with key k
  n = n + 1 // increment number of entries
  return p
}
Map via a List (Pseudocode)
Erase
Algorithm erase(k) {
  for each p in [S.begin(), S.end()) {
    if p->key() == k {
      S.erase(p)
      n = n - 1 // decrement number of entries
      return // keys are unique, so stop after the first match
    }
  }
}
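The three list-based algorithms above can be sketched in C++ roughly as follows. This is a minimal sketch, not the course's implementation: the class name ListMap is hypothetical, entries are stored as std::pair rather than an Entry class, and the entry count comes from the list itself instead of a separate counter n.

```cpp
#include <iterator>
#include <list>
#include <utility>

// Hypothetical ListMap: an unsorted std::list of (key, value) pairs.
template <typename K, typename V>
class ListMap {
public:
    using Entry = std::pair<K, V>;
    using Iterator = typename std::list<Entry>::iterator;

    int size() const { return static_cast<int>(S.size()); }
    bool empty() const { return S.empty(); }
    Iterator end() { return S.end(); }

    // Linear scan: O(n) in the worst case (key not present).
    Iterator find(const K& k) {
        for (Iterator p = S.begin(); p != S.end(); ++p) {
            if (p->first == k) return p;
        }
        return S.end();  // there is no entry with key k
    }

    // Replace the value if k is present, otherwise append a new entry.
    Iterator put(const K& k, const V& v) {
        Iterator p = find(k);
        if (p != S.end()) {
            p->second = v;
            return p;
        }
        S.push_back(Entry(k, v));
        return std::prev(S.end());  // iterator to the new last entry
    }

    // Remove the entry with key k, if there is one.
    void erase(const K& k) {
        Iterator p = find(k);
        if (p != S.end()) S.erase(p);
    }

private:
    std::list<Entry> S;  // the sequence of entries
};
```

Both put and erase reuse find, which makes the O(n) cost of every operation easy to see.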
Problems with the Unsorted List Map
Performance:
● put() takes O(n) time, since we must first determine whether the key is already in the sequence
● find() and erase() take O(n) time, since in the worst case (the item is not found) we traverse the entire sequence looking for an item with the given key
The unsorted list implementation is effective only for small maps, or for maps in which puts are the most common operation
● Searches and removals should be rarely performed, since a search ends early only when the key happens to sit near the front of the sequence
● Example: historical record of logins to a system
Hash Tables for Maps
Video: Data Structures: Hash Tables
Learn the basics of hash tables, one of the most useful data structures for solving interview questions. This video is part of HackerRank's Cracking The Coding Interview Tutorial with Gayle Laakmann McDowell.
http://www.hackerrank.com/domains/tutorials/cracking-the-coding-interview?utm_source=video&utm_medium=youtube&utm_campaign=ctci
Maps with Hash Tables
A hash function h maps keys of a given type to integers in a fixed interval [0, N − 1]
A hash table for a given key type consists of:
● Hash function h
● Array (called table) of size N
When implementing a hash table, the goal is to store item (k, v) at index i = h(k)
Example of Hash Tables
We design a hash table for a map storing entries as (SSN, Name), where SSN (social security number) is a nine-digit positive integer
If the keys are integers well distributed in the range [0, N − 1], this bucket array
is all that is needed.
● Thus, searches, insertions, and removals in the bucket array take O(1) time.
Drawbacks:
● The space used is proportional to N and if we don’t have many entries, we waste space
● Keys are required to be integers in the range [0, N − 1], which is often not the case.
We normally use a bucket array in conjunction with a “good” mapping from the keys to the integers
Hash Functions
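A hash function is usually specified as the composition of two functions:
● Hash code h1: keys → integers
● Compression function h2: integers → [0, N − 1]
The hash code is applied first, and then the compression function is applied to the result: h(k) = h2(h1(k))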
The goal of the hash function is to disperse the keys in a seemingly random way
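To make the two-stage structure concrete, here is a minimal C++ sketch for string keys. The function names hashCode, compress, and hashIndex are assumptions made for illustration, and the hash code is simply delegated to std::hash rather than to any scheme from these slides.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// h1: hash code, maps a key to an integer (delegated to std::hash here).
std::size_t hashCode(const std::string& key) {
    return std::hash<std::string>{}(key);
}

// h2: compression by division, maps an integer into [0, N - 1].
std::size_t compress(std::size_t y, std::size_t N) {
    assert(N > 0);  // the table must have at least one cell
    return y % N;
}

// h(k) = h2(h1(k)): the table index for a key.
std::size_t hashIndex(const std::string& key, std::size_t N) {
    return compress(hashCode(key), N);
}
```

Keeping the two stages separate means the table can be resized (a new N) without changing how keys are turned into integers.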
Hash Code Function
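Memory address:
● We reinterpret the memory address of the key object as an integer
Integer cast:
● We reinterpret the bits of the key as an integer
● Suitable for keys whose length is less than or equal to the number of bits of the integer type
Component sum:
● We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits)
● We sum the components (ignoring overflows)
● Suitable for numeric keys whose fixed length is greater than or equal to the number of bits of the integer type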
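As an illustration of the component-sum idea, the sketch below splits a 64-bit key into two 32-bit components and adds them. The function name componentSum and the choice of a 64-bit key with 32-bit components are assumptions for this example only.

```cpp
#include <cstdint>

// Component-sum hash code for a 64-bit key: split the key into two
// 32-bit components and add them. The result is truncated to 32 bits,
// so any overflow of the sum is ignored.
std::uint32_t componentSum(std::uint64_t key) {
    std::uint32_t low = static_cast<std::uint32_t>(key);         // low 32 bits
    std::uint32_t high = static_cast<std::uint32_t>(key >> 32);  // high 32 bits
    return low + high;
}
```

For example, a key whose high half is 1 and whose low half is 2 gets hash code 3, while a key with high half 0xFFFFFFFF and low half 1 wraps around to 0.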
Hash Compression Function
Division: h2(y) = y mod N
Hash Function: h(k) = k % N where N = 4
K1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
K2: 0, 4, 8, 12, 16, 20, 24, 28, 32, 36

Bucket   K1 Compress   K2 Compress
0        [0, 4, 8]     [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
1        [1, 5, 9]     []
2        [2, 6]        []
3        [3, 7]        []
Division Compressions and Primes
Hash Function: h(k) = k % N where N = 7
K1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
K2: 0, 4, 8, 12, 16, 20, 24, 28, 32, 36

Bucket   K1 Compress   K2 Compress
0        [0, 7]        [0, 28]
1        [1, 8]        [8, 36]
2        [2, 9]        [16]
3        [3]           [24]
4        [4]           [4, 32]
5        [5]           [12]
6        [6]           [20]
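The effect of the table size on the K2 keys can be reproduced with a short C++ sketch. The helper names compressKeys and longestBucket are hypothetical, and the code assumes non-negative integer keys so that k % N is a valid index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Distribute keys over N buckets using the division method h(k) = k % N.
// Assumes every key is non-negative.
std::vector<std::vector<int>> compressKeys(const std::vector<int>& keys, int N) {
    assert(N > 0);
    std::vector<std::vector<int>> buckets(N);
    for (int k : keys) {
        buckets[k % N].push_back(k);
    }
    return buckets;
}

// Size of the fullest bucket: a rough measure of how badly keys collide.
std::size_t longestBucket(const std::vector<std::vector<int>>& buckets) {
    std::size_t longest = 0;
    for (const auto& b : buckets) {
        if (b.size() > longest) longest = b.size();
    }
    return longest;
}
```

With the K2 keys (multiples of 4), a table of size 4 puts all ten keys in bucket 0, whereas the prime size 7 leaves at most two keys in any bucket.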
Collision Handling
Collisions occur when different elements are mapped to the same cell
Linear probing:
● If cell A[i] is occupied, the next cell probed is A[(i + 1) mod N]
Quadratic probing:
● The interval between probes increases quadratically (hence, the indices are described by a quadratic function, as opposed to linear probing, which advances by a fixed interval)
● Probe A[(i + f(j)) mod N], for j = 0, 1, 2, …, N, where f(j) = j^2
Double hashing:
● The interval between probes is fixed for each record but is computed by another hash
function
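The three probing strategies differ only in how the j-th probe index is computed from the home cell i. The sketch below is an illustration under assumptions: the function names are hypothetical, and for double hashing the step is assumed to be the value of a second hash function of the key, lying in [1, N − 1].

```cpp
#include <cassert>
#include <cstddef>

// Linear probing: the j-th probe is j cells past the home cell i.
std::size_t linearProbe(std::size_t i, std::size_t j, std::size_t N) {
    assert(N > 0);
    return (i + j) % N;
}

// Quadratic probing: the offset from the home cell is f(j) = j * j.
std::size_t quadraticProbe(std::size_t i, std::size_t j, std::size_t N) {
    assert(N > 0);
    return (i + j * j) % N;
}

// Double hashing: the offset grows by a per-key step, assumed here to be
// the result of a second hash function with 1 <= step <= N - 1.
std::size_t doubleHashProbe(std::size_t i, std::size_t j,
                            std::size_t step, std::size_t N) {
    assert(N > 0);
    return (i + j * step) % N;
}
```

For a table of size 7 and home cell 5, the first three linear probes after the home cell visit 6, 0, 1, while the quadratic probes visit 6, 2, 0.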
Interface
void erase(const K& k); // remove entry with key k
void erase(const Iterator& p); // erase entry at p
Iterator begin(); // iterator to first entry
Iterator end(); // iterator to end entry
HashTable Map
// Some functions left out and utilities (next slide)
protected:
typedef std::list<Entry> Bucket; // a bucket of entries
typedef std::vector<Bucket> BktArray; // a bucket array
private:
int n; // number of entries
H hash; // the hash comparator
BktArray B; // bucket array
};
HashTable Map
// find utility
Iterator finder(const K& k);
// insert utility
Iterator inserter(const Iterator& p, const Entry& e);
// remove utility
void eraser(const Iterator& p);
// entry iterator
typedef typename Bucket::iterator EItor;
// end of bucket?
static bool endOfBkt(const Iterator& p) {
return p.ent == p.bkt->end();
}
Entry Class
// a (key, value) pair
template <typename K, typename V>
class Entry {
public:
// constructor
Entry(const K& k = K(), const V& v = V()) : _key(k), _value(v) {}
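Since the slide shows only the constructor, here is one way the rest of the Entry class might look. This is a sketch under assumptions: key() and setValue() appear in the list-map pseudocode earlier, while value(), setKey(), and the private member layout are guesses added for completeness.

```cpp
// A (key, value) pair with simple accessors and mutators.
template <typename K, typename V>
class Entry {
public:
    // constructor, with default-constructed key and value if omitted
    Entry(const K& k = K(), const V& v = V()) : _key(k), _value(v) {}

    const K& key() const { return _key; }      // get key
    const V& value() const { return _value; }  // get value (assumed name)
    void setKey(const K& k) { _key = k; }      // set key (assumed name)
    void setValue(const V& v) { _value = v; }  // set value

private:
    K _key;    // the key
    V _value;  // the value
};
```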
Iterator ==
// ba (Bucket Array) or bkt (Bucket) differ?
if (ba != p.ba || bkt != p.bkt) return false;
Iterator ++
// at end of bucket?
if (endOfBkt(*this)) {
// go to next bucket
++bkt;
HashMap Functions
// constructor
template <typename K, typename V, typename H>
HashMap<K,V,H>::HashMap(int capacity) : n(0), B(capacity) { }
// number of entries
template <typename K, typename V, typename H>
int HashMap<K,V,H>::size() const {
return n;
}
// iterator to front
template <typename K, typename V, typename H>
Find
// search for k
while (!endOfBkt(p) && (*p).key() != k) { nextEntry(p); }
return p; // return final position
}
// find key
Erase
// remove entry at p
template <typename K, typename V, typename H>
void HashMap<K,V,H>::erase(const Iterator& p) {
eraser(p);
}
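Because the HashMap code on these slides is shown only in fragments, the following self-contained sketch pulls the same ideas together: a vector of list buckets, a hash comparator H, and an entry counter n. It is a simplified illustration, not the slides' class. The name ChainedMap is hypothetical, entries are std::pair, and find returns a pointer instead of the custom Iterator.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <utility>
#include <vector>

// Hypothetical ChainedMap: separate chaining with list buckets.
template <typename K, typename V, typename H = std::hash<K>>
class ChainedMap {
public:
    explicit ChainedMap(int capacity = 101) : n(0), B(capacity) {
        assert(capacity > 0);  // at least one bucket is required
    }

    int size() const { return n; }
    bool empty() const { return n == 0; }

    // Returns a pointer to the value stored for k, or nullptr if absent.
    V* find(const K& k) {
        for (auto& e : bucketFor(k)) {
            if (e.first == k) return &e.second;
        }
        return nullptr;
    }

    // Insert (k, v), or replace the value if k is already present.
    void put(const K& k, const V& v) {
        if (V* found = find(k)) {
            *found = v;
            return;
        }
        bucketFor(k).push_back(std::make_pair(k, v));
        ++n;
    }

    // Remove the entry with key k, if there is one.
    void erase(const K& k) {
        Bucket& b = bucketFor(k);
        for (auto p = b.begin(); p != b.end(); ++p) {
            if (p->first == k) {
                b.erase(p);
                --n;
                return;
            }
        }
    }

private:
    typedef std::list<std::pair<K, V>> Bucket;  // a bucket of entries
    typedef std::vector<Bucket> BktArray;       // a bucket array

    // Hash code from H, compressed by division into a bucket index.
    Bucket& bucketFor(const K& k) { return B[hash(k) % B.size()]; }

    int n;       // number of entries
    H hash;      // the hash comparator
    BktArray B;  // bucket array
};
```

Each operation touches only one bucket, so with a well-spread hash function the expected cost stays close to O(1) as long as the table is not overloaded.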