Cryptography and
Network Security
Chapter 12
Fourth Edition
by William Stallings
Lecture slides by Lawrie Brown
Chapter 12 – Hash and MAC
Algorithms
Each of the messages, like each one he had ever
read of Stern's commands, began with a number
and ended with a number or row of numbers. No
efforts on the part of Mungo or any of his experts
had been able to break Stern's code, nor was
there any clue as to what the preliminary number
and those ultimate numbers signified.
—Talking to Strange Men, Ruth Rendell
Hash function
A cryptographic hash function is an
algorithm that takes an arbitrary amount of
input data (for example, a credential such as
a password) and produces a fixed-size output
called a hash value, or just “hash.”
Unlike enciphered text, a hash value cannot be
decrypted. It can be stored instead of the
password itself, and later used to verify the user.
Cont.
A hash function is a mathematical function
that converts a numerical input value into
another compressed numerical value.
The input to the hash function is of
arbitrary length (any length) but output is
always of fixed length.
Values returned by a hash function are
called message digests or simply hash
values.
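The fixed-length property can be checked directly. The following is a minimal sketch using SHA-256 from Python's standard `hashlib` module; the choice of SHA-256 and the helper name `sha256_hex` are assumptions made for illustration, not part of the slides.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different lengths all map to a 256-bit digest
# (64 hexadecimal digits, 4 bits per digit).
for message in (b"", b"abc", b"x" * 1_000_000):
    digest = sha256_hex(message)
    print(len(message), "bytes ->", len(digest) * 4, "bits:", digest[:16] + "...")
```

Whatever the input length, the printed digest size stays at 256 bits.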
Structure of hash function
Features of hash function
The typical features of hash functions are −
• Fixed Length Output (Hash Value)
• A hash function converts data of arbitrary length to a fixed length. This
process is often referred to as hashing the data.
• In general, the hash is much smaller than the input data, hence hash
functions are sometimes called compression functions.
• Since a hash is a smaller representation of a larger piece of data, it is also
referred to as a digest.
• A hash function with an n-bit output is referred to as an n-bit hash function.
• Popular hash functions generate values between 128 and 512 bits.
Cont.
Efficiency of Operation
Generally, for any hash function h with input x,
computation of h(x) is a fast operation.
Hash functions are computationally much
faster than symmetric encryption.
Properties of hash function
To be an effective cryptographic tool, a hash
function should possess the following properties −
Pre-Image Resistance
This property means that it should be computationally hard to
reverse a hash function.
In other words, if a hash function h produced a hash value z,
then it should be a difficult process to find any input value x that
hashes to z.
This property protects against an attacker who only has a hash
value and is trying to find the input.
Cont.
Second Pre-Image Resistance
This property means given an input and its hash, it
should be hard to find a different input with the same
hash.
In other words, if a hash function h for an input x
produces hash value h(x), then it should be difficult to
find any other input value y such that h(y) = h(x).
This property protects against an attacker who
has an input value and its hash, and who wants to
substitute a different value as the legitimate
value in place of the original input.
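To get a feel for the cost of a second pre-image search, the sketch below attacks a deliberately weakened hash: only the top 16 bits of SHA-256. The truncation, the names `tiny_hash` and `find_second_preimage`, and the candidate message format are all assumptions chosen for illustration. Against a full-length digest the same loop would be computationally infeasible.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, bits: int = 16) -> int:
    """Toy hash: the top `bits` bits of SHA-256 (illustration only)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_second_preimage(x: bytes, bits: int = 16):
    """Brute-force search for y != x with tiny_hash(y) == tiny_hash(x)."""
    target = tiny_hash(x, bits)
    for tries in count(1):
        y = b"candidate-%d" % tries
        if y != x and tiny_hash(y, bits) == target:
            return y, tries

y, tries = find_second_preimage(b"pay Alice 10")
print(y, "found after", tries, "tries")
```

With an n-bit hash the expected number of tries is about 2^n, so each extra output bit doubles the attacker's work.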
Cont.
Collision Resistance
This property means it should be hard to find two different inputs of
any length that result in the same hash. A hash function with this
property is also referred to as a collision-free hash function.
In other words, for a hash function h, it is hard to find any two
different inputs x and y such that h(x) = h(y).
Since a hash function compresses its input to a fixed hash length,
it is impossible for a hash function not to have collisions. The
collision-free property only means that these collisions should
be hard to find.
This property makes it very difficult for an attacker to find two input
values with the same hash.
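The point that collisions must exist but should be hard to find can be illustrated with a weakened hash. This sketch truncates SHA-256 to 24 bits (an assumption made only so the search finishes quickly) and remembers every value seen until two distinct messages collide. The names `tiny_hash` and `find_collision` are hypothetical.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, bits: int = 24) -> int:
    """Toy hash: the top `bits` bits of SHA-256 (illustration only)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int = 24):
    """Hash distinct messages until two of them share a hash value."""
    seen = {}  # hash value -> first message that produced it
    for i in count():
        msg = b"msg-%d" % i
        h = tiny_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

x, y, tries = find_collision()
print(x, "and", y, "collide after", tries, "messages")
```

Because any pair may collide, the search typically succeeds after roughly 2^(n/2) messages rather than 2^n, which is why an n-bit hash offers only about n/2 bits of collision resistance.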
Popular Hash Functions
Message Digest (MD)
MD5 was the most popular and widely used hash function for quite
some years.
The MD family comprises the hash functions MD2, MD4, MD5 and
MD6. MD5 was published as RFC 1321. It is a 128-bit
hash function.
MD5 digests have been widely used in the software world to provide
assurance about the integrity of a transferred file. For example, file
servers often provide a pre-computed MD5 checksum for their files,
so that a user can compare the checksum of the downloaded file to
it.
In 2004, collisions were found in MD5. An analytical attack was
reported to succeed in about an hour using a computer cluster.
This collision attack compromised MD5, and hence it is
no longer recommended for use.
Cont.
Secure Hash Algorithm (SHA)
The SHA family comprises four algorithms: SHA-0, SHA-1,
SHA-2, and SHA-3. Though from the same family, they are structurally
different.
The original version, SHA-0, a 160-bit hash function, was published by
the National Institute of Standards and Technology (NIST) in 1993. It
had a few weaknesses and did not become very popular. Later, in 1995,
SHA-1 was designed to correct the alleged weaknesses of SHA-0.
SHA-1 is the most widely used of the existing SHA hash functions. It is
employed in several widely used applications and protocols including
Secure Socket Layer (SSL) security.
In 2005, a method was found for uncovering collisions for SHA-1 within a
practical time frame, making the long-term employability of SHA-1 doubtful.
The SHA-2 family has four further SHA variants, SHA-224, SHA-256, SHA-384,
and SHA-512, depending upon the number of bits in their hash value.
No successful attacks have yet been reported on the SHA-2 hash functions.
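The naming of the SHA-2 variants by output length can be confirmed with Python's standard `hashlib` module. This is a small sketch; the dictionary name `SHA2_DIGEST_BITS` is an assumption introduced for the example.

```python
import hashlib

SHA2_NAMES = ("sha224", "sha256", "sha384", "sha512")

# digest_size is reported in bytes, so multiply by 8 to get bits.
SHA2_DIGEST_BITS = {name: hashlib.new(name).digest_size * 8 for name in SHA2_NAMES}

for name, bits in SHA2_DIGEST_BITS.items():
    print(name, "produces a", bits, "bit hash value")
```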
Cont.
RIPEMD is an acronym for RACE Integrity Primitives
Evaluation Message Digest. This set of hash functions was
designed by the open research community and is generally known as a
family of European hash functions.
The set includes RIPEMD, RIPEMD-128, and RIPEMD-160. There also exist
256-bit and 320-bit versions of this algorithm.
The original RIPEMD (128 bit) is based upon the design principles used in MD4 and
was found to provide questionable security. The RIPEMD 128-bit version came as a
quick-fix replacement to overcome vulnerabilities in the original RIPEMD.
RIPEMD-160 is an improved version and the most widely used version in the
family. The 256-bit and 320-bit versions reduce the chance of accidental collision,
but do not have higher levels of security than RIPEMD-128 and
RIPEMD-160 respectively.
Whirlpool
This is a 512-bit hash function.
It is derived from a modified version of the
Advanced Encryption Standard (AES). One of
its designers was Vincent Rijmen, a co-creator
of the AES.
Three versions of Whirlpool have been
released, namely WHIRLPOOL-0,
WHIRLPOOL-T, and WHIRLPOOL.
Applications of Hash Functions
There are two direct applications of hash functions
based on their cryptographic properties:
Password Storage
Hash functions provide protection for password
storage.
Instead of storing passwords in the clear, almost all
logon processes store the hash values of
passwords in a file.
The password file consists of a table of pairs
of the form (user id, h(P)).
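The (user id, h(P)) table can be sketched in a few lines. The example below follows the slide literally and uses plain SHA-256 as h, which is an assumption for illustration; the names `password_table`, `register` and `verify` are hypothetical. Real systems normally add a per-user salt and a deliberately slow function such as `hashlib.pbkdf2_hmac`, which this sketch omits.

```python
import hashlib
import hmac

password_table = {}  # maps user id -> h(P), never the password itself

def h(password: str) -> str:
    """Hash a password with SHA-256 (unsalted, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(user_id: str, password: str) -> None:
    """Store the pair (user id, h(P))."""
    password_table[user_id] = h(password)

def verify(user_id: str, password: str) -> bool:
    """Recompute h(P) at logon and compare it with the stored value."""
    stored = password_table.get(user_id)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, h(password))

register("alice", "correct horse")
print(verify("alice", "correct horse"))  # True
print(verify("alice", "wrong guess"))    # False
```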
Cont.
Data Integrity Check
A data integrity check is one of the most common
applications of hash functions. It is used
to generate checksums on data files.
This application provides assurance to the
user about the correctness of the data.
The integrity check helps the user to detect any changes made to
the original file.
It does not, however, provide any assurance about originality.
The attacker, instead of modifying the file data, can replace the entire file,
compute an altogether new hash, and send both to the receiver. This
integrity check is useful only if the user is sure about the
originality of the file.
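A checksum comparison of this kind might look as follows. This is a sketch under stated assumptions: SHA-256 is used as the checksum, the data is read from any binary file-like object (an in-memory `io.BytesIO` stands in for a downloaded file), and the function names are hypothetical.

```python
import hashlib
import io

def digest_stream(stream, chunk_size: int = 65536) -> str:
    """Hash a binary stream chunk by chunk so large files fit in memory."""
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    return h.hexdigest()

def matches_published(stream, published_hex: str) -> bool:
    """Compare the computed checksum with the one published for the file."""
    return digest_stream(stream) == published_hex.strip().lower()

original = b"release-1.0 contents"
published = digest_stream(io.BytesIO(original))

print(matches_published(io.BytesIO(original), published))         # True
print(matches_published(io.BytesIO(original + b"!"), published))  # False
```

The check only detects changes relative to the published checksum. If an attacker can replace both the file and the checksum, the comparison still succeeds, which is the limitation described above.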
Message Authentication Code
(MAC)
A Message Authentication Code (MAC), also
referred to as a tag, is used to
authenticate the origin and integrity of a
message.
MACs use a secret key shared by sender and
receiver to verify the legitimacy of data sent
through a network or transferred from one
person to another.
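As a rough illustration of how a tag is produced and checked, the sketch below uses Python's standard `hmac` module with SHA-256. HMAC is only one possible MAC construction, and the function names `make_tag` and `verify_tag` and the sample key are assumptions made for the example.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute the tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

shared_key = b"example shared secret"  # hypothetical key for the demo
message = b"transfer 100 to Bob"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                  # True
print(verify_tag(shared_key, b"transfer 900 to Bob", tag))   # False
```

Unlike a plain hash, the tag cannot be recomputed by someone who does not know the key, so a valid tag indicates both that the message is unchanged and that it came from a key holder.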
STRUCTURE OF MAC
Hash and MAC Algorithms
Hash Functions
condense arbitrary size message to fixed
size by processing message in blocks
through some compression function
either custom or block cipher based
Message Authentication Code (MAC)
fixed sized authenticator for some message
to provide authentication for message
by using block cipher mode or hash function
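The idea of condensing a message by passing it block by block through a compression function can be sketched with a toy iterated hash. Everything here is an assumption made for illustration: the block and state sizes, the all-zero initial value, the padding layout, and the stand-in compression function (truncated SHA-256). It is not a real hash design and must not be used for security.

```python
import hashlib

BLOCK_SIZE = 16          # bytes per message block (toy value)
STATE_SIZE = 8           # bytes of chaining state and of the final hash (toy value)
IV = bytes(STATE_SIZE)   # all-zero initial value, chosen arbitrarily

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the 8-byte message length in bits."""
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK_SIZE)
    return padded + (len(message) * 8).to_bytes(8, "big")

def toy_hash(message: bytes) -> bytes:
    """Feed each padded block through the compression function in turn."""
    state = IV
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state

print(toy_hash(b"abc").hex())
print(toy_hash(b"a much longer message spanning several blocks").hex())
```

Both calls return an 8-byte value regardless of message length, because only the fixed-size chaining state survives from one block to the next.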
Characteristics of a Good
Hash Function
There are four main characteristics of a good
hash function:
The hash value is fully determined by the data
being hashed.
The hash function uses all the input data.
The hash function "uniformly" distributes the data
across the entire set of possible hash values.
The hash function generates very different hash
values for similar strings.
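The last characteristic, very different hash values for similar strings, can be observed by counting how many output bits change when one input character changes. This sketch uses SHA-256 as an assumed example hash; the helper name `bit_difference` is hypothetical.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-256 digests of a and b."""
    ha = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hb = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(ha ^ hb).count("1")

# The two inputs differ in a single character.
changed = bit_difference(b"hello world", b"hello worle")
print(changed, "of 256 output bits differ")
```

For a well-behaved hash, roughly half of the output bits are expected to flip, even for a one-character change.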
Hash Algorithm Structure
Secure Hash Algorithm
SHA originally designed by NIST & NSA in 1993
was revised in 1995 as SHA-1
US standard for use with DSA signature scheme
standard is FIPS 180-1 1995, also Internet RFC3174
nb. the algorithm is SHA, the standard is SHS
based on design of MD4 with key differences
produces 160-bit hash values
recent 2005 results on security of SHA-1 have
raised concerns on its use in future applications
Revised Secure Hash
Standard
NIST issued revision FIPS 180-2 in 2002
adds 3 additional versions of SHA
SHA-256, SHA-384, SHA-512
designed for compatibility with increased
security provided by the AES cipher
structure & detail is similar to SHA-1
hence analysis should be similar
but security levels are rather higher
SHA-512 Overview
SHA-512 Compression
Function
heart of the algorithm
processing message in 1024-bit blocks
consists of 80 rounds
updating a 512-bit buffer
using a 64-bit value W_t derived from the
current message block
and a round constant based on the cube roots of
the first 80 prime numbers
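The round constants can be reproduced from that description. The sketch below assumes the usual definition, in which each constant is the first 64 bits of the fractional part of the cube root of a prime, and uses exact integer arithmetic to avoid floating-point error. The function names are hypothetical.

```python
import hashlib

def first_primes(n: int) -> list:
    """Return the first n prime numbers by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def integer_cube_root(n: int) -> int:
    """Largest integer r with r**3 <= n, found by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def sha512_round_constants() -> list:
    """First 64 fractional bits of the cube root of each of the first 80 primes."""
    mask = (1 << 64) - 1
    # cbrt(p * 2**192) == cbrt(p) * 2**64, so the low 64 bits are the fraction.
    return [integer_cube_root(p << 192) & mask for p in first_primes(80)]

K = sha512_round_constants()
print(len(K), "round constants, the first is", hex(K[0]))

# hashlib reports sizes in bytes: 1024-bit blocks and a 512-bit result.
print(hashlib.sha512().block_size * 8, hashlib.sha512().digest_size * 8)
```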
SHA-512 Round Function
Whirlpool
now examine the Whirlpool hash function
endorsed by European NESSIE project
uses modified AES internals as
compression function
addressing concerns on use of block
ciphers seen previously
with performance comparable to dedicated
algorithms like SHA
Whirlpool Overview
Whirlpool Block Cipher W
designed specifically for hash function use
with security and efficiency of AES
but with 512-bit block size and hence 512-bit hash
similar structure & functions as AES but
input is mapped row wise
has 10 rounds
a different primitive polynomial for GF(2^8)
uses different S-box design & values
Whirlpool Performance &
Security
Whirlpool is a very new proposal
hence little experience with use
but many AES findings should apply
does seem to need more h/w than SHA,
but with better resulting performance
Keyed Hash Functions as MACs
want a MAC based on a hash function
because hash functions are generally faster
code for crypto hash functions widely
available
hash includes a key along with message
original proposal:
KeyedHash = Hash(Key || Message)
some weaknesses were found with this
eventually led to development of HMAC
HMAC
specified as Internet standard RFC2104
uses hash function on the message:
HMAC_K(M) = Hash[(K+ XOR opad) ||
Hash[(K+ XOR ipad) || M]]
where K+ is the key padded out to the block size of the hash function
and opad, ipad are specified padding constants
overhead is just 3 more hash compression function calculations than
the message needs alone
any hash function can be used
eg. MD5, SHA-1, RIPEMD-160, Whirlpool
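The HMAC formula can be written out step by step and checked against Python's standard `hmac` module. This is a sketch rather than a reference implementation: the function name `hmac_by_formula` is hypothetical, SHA-256 is an assumed default, and the constants 0x36 (ipad) and 0x5C (opad) and the rule of hashing over-long keys first are taken from RFC 2104.

```python
import hashlib
import hmac

def hmac_by_formula(key: bytes, message: bytes, hash_name: str = "sha256") -> bytes:
    """Compute Hash[(K+ XOR opad) || Hash[(K+ XOR ipad) || M]]."""
    block_size = hashlib.new(hash_name).block_size
    # Keys longer than one block are first replaced by their hash.
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()
    # K+ is the key padded with zero bytes to the hash block size.
    k_plus = key.ljust(block_size, b"\x00")
    ipad_key = bytes(b ^ 0x36 for b in k_plus)
    opad_key = bytes(b ^ 0x5C for b in k_plus)
    inner = hashlib.new(hash_name, ipad_key + message).digest()
    return hashlib.new(hash_name, opad_key + inner).digest()

key, message = b"secret key", b"message to authenticate"
ours = hmac_by_formula(key, message)
reference = hmac.new(key, message, "sha256").digest()
print(ours == reference)  # True
```

Agreement with the library result suggests the formula has been transcribed correctly. In practice the library function is the one to call.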
HMAC Overview
HMAC Security
proved security of HMAC relates to that of
the underlying hash algorithm
attacking HMAC requires either:
brute force attack on key used
birthday attack (but since keyed would need
to observe a very large number of messages)
choose hash function used based on
speed versus security constraints
CMAC
previously saw the DAA (CBC-MAC)
widely used in govt & industry
but has message size limitation
can overcome using 2 keys & padding
thus forming the Cipher-based Message
Authentication Code (CMAC)
adopted by NIST SP800-38B
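The two extra keys are derived from the cipher key rather than chosen independently. The sketch below shows only that subkey derivation for a 128-bit block cipher, following SP 800-38B as I understand it: encrypt the all-zero block to get L, then double it twice in GF(2^128). The AES step itself is not performed here. The value of L is taken from the AES-128 example in RFC 4493, and the function names are hypothetical.

```python
R128 = 0x87                 # reduction constant for 128-bit blocks
MASK128 = (1 << 128) - 1

def double(block: bytes) -> bytes:
    """Shift a 128-bit block left by one bit; XOR in R128 if a bit fell off."""
    value = int.from_bytes(block, "big")
    shifted = (value << 1) & MASK128
    if value >> 127:        # the most significant bit was set
        shifted ^= R128
    return shifted.to_bytes(16, "big")

def cmac_subkeys(l_block: bytes):
    """Derive K1 and K2 from L, the encryption of the all-zero block."""
    k1 = double(l_block)
    k2 = double(k1)
    return k1, k2

# L for the RFC 4493 example key (assumed input, computed with AES elsewhere).
L = bytes.fromhex("7df76b0c1ab899b33e42f047b91b546f")
K1, K2 = cmac_subkeys(L)
print("K1 =", K1.hex())
print("K2 =", K2.hex())
```

K1 is mixed into the final block when the message is a whole number of blocks, and K2 when the final block had to be padded, which is how CMAC lifts the message size limitation of plain CBC-MAC.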
CMAC Overview
Summary
have considered:
some current hash algorithms
• SHA-512 & Whirlpool
HMAC authentication using hash function
CMAC authentication using a block cipher