Module IV
Prepared by
K.Indragandhi,AP(Sr.Gr.)/ECE
Module-IV
IMAGE COMPRESSION
• Definition: Compression means storing data in a
format that requires less space than usual.
• The bandwidth of a digital communication link
can be effectively increased by compressing data
at the sending end and decompressing data at the
receiving end.
Types of Data Compression
• In lossless data compression, the original
message can be decoded exactly.
Compression Algorithms
• Huffman Coding
• Shift Codes
• Arithmetic Codes
• Transform codes
• Vector Quantization
Huffman Coding
• The idea is to use short bit strings to represent
the most frequently used characters
• That is, the most common characters (usually space, e,
and t) are assigned the shortest codes.
Huffman Algorithm
• Merge the two into a single tree with a new root
node whose left and right sub trees are the two
we chose.
Huffman Coding Example
• Character frequencies
– A: 20% (.20)
– B: 9% (.09)
– C: 15%
– D: 11%
– E: 40%
– F: 5%
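The merge step of the Huffman algorithm can be sketched for these frequencies as follows. This is a minimal illustration, not the method used to draw the slides: the function name `huffman_codes`, the use of integer percentages as weights, and the rule "first popped subtree goes on branch 0" are assumptions, so the individual bit patterns may differ from a hand-drawn tree even though the code lengths agree.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a prefix code by repeatedly merging the two lightest trees."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    # Heap entry: (weight, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    heap = [(w, next(tiebreak), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)    # lightest tree
        w1, _, right = heapq.heappop(heap)   # second lightest tree
        heapq.heappush(heap, (w0 + w1, next(tiebreak), (left, right)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):          # internal node: descend both branches
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                                # leaf: record the accumulated bits
            codes[tree] = prefix or "0"

    walk(heap[0][2], "")
    return codes

# Frequencies from the example, written as integer percentages (assumption).
freqs = {"A": 20, "B": 9, "C": 15, "D": 11, "E": 40, "F": 5}
codes = huffman_codes(freqs)
lengths = {sym: len(code) for sym, code in codes.items()}
print(sorted(lengths.items()))
```

With these weights E receives a 1-bit code, A, C and D receive 3-bit codes, and B and F receive 4-bit codes.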
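Huffman Code
• First merge: the two least probable symbols, B (.09)
and F (.05), are joined into node BF (.14), with B on
branch 0 and F on branch 1.
• Nodes after the merge: E .40, BF .14, D .11, A .20, C .15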
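Huffman Code
• Tree (branch 0 listed first, branch 1 second)
– ABCDEF (1.0): ABCDF (.6), E (.4)
– ABCDF (.6): BFD (.25), AC (.35)
– BFD (.25): BF (.14), D (.11)
– AC (.35): A (.20), C (.15)
– BF (.14): B (.09), F (.05)
• Codes
– A: 010
– B: 0000
– C: 011
– D: 001
– E: 1
– F: 0001
• Note
– No code is a prefix of another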
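Because no code is a prefix of another, a bit stream can be decoded left to right without separators. The sketch below checks the prefix property for the code table on this slide and decodes a short stream; the helper names `is_prefix_free` and `decode` and the sample word "FACE" are illustrative assumptions.

```python
# Code table taken from the slide.
CODES = {"A": "010", "B": "0000", "C": "011", "D": "001", "E": "1", "F": "0001"}

def is_prefix_free(codes):
    """True if no code word is the start of a different code word."""
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

def decode(bits, codes):
    """Read bits one at a time and emit a symbol whenever a code word completes."""
    inverse = {word: sym for sym, word in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return "".join(out)

encoded = "".join(CODES[sym] for sym in "FACE")
print(encoded)                 # 00010100111
print(decode(encoded, CODES))  # FACE
```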
Huffman Coding
• TENNESSEE
• Tree (branch 0 listed first, branch 1 second)
– root (9): node (5), e(4)
– node (5): s(2), node (3)
– node (3): t(1), n(2)
• ENCODING
– E : 1
– S : 00
– T : 010
– N : 011
• Average code length = (1*4 + 2*2 + 3*2 + 3*1) / 9 = 1.89
Average Code Length
= 17 / 9 = 1.89 bits per symbol
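The arithmetic can be checked with a few lines of code. This is a small sketch using the TENNESSEE code table from the previous slide; the function name `average_code_length` is an assumption made for illustration.

```python
from collections import Counter

# Code table for TENNESSEE, as given on the previous slide.
CODES = {"E": "1", "S": "00", "T": "010", "N": "011"}

def average_code_length(message, codes):
    """Return (total encoded bits, bits per symbol) for the message."""
    counts = Counter(message)
    total_bits = sum(n * len(codes[sym]) for sym, n in counts.items())
    return total_bits, total_bits / len(message)

total_bits, avg = average_code_length("TENNESSEE", CODES)
print(total_bits, round(avg, 2))  # 17 1.89
```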
ENTROPY
Entropy is a measure of average information content:
the more probable a message, the less information it
carries, so a source dominated by highly probable
messages has low entropy.
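As an illustration, the sketch below computes entropy for the symbol frequencies of TENNESSEE, assuming the usual Shannon definition H = -sum(p * log2(p)), which the slide does not state explicitly. The function name `entropy_bits` is hypothetical.

```python
import math
from collections import Counter

def entropy_bits(message):
    """Shannon entropy in bits per symbol, using relative symbol frequencies."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

h = entropy_bits("TENNESSEE")
print(round(h, 3))
```

The result, roughly 1.84 bits per symbol, lies slightly below the 1.89 bits per symbol achieved by the Huffman code for the same word, which is consistent with entropy acting as a lower bound on the average code length.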
Advantages & Disadvantages
• Although Huffman coding can be suboptimal because each
code word must use an integral number of bits, it is
relatively easy to implement and very efficient for
coding and decoding.
Run-length encoding
• Run-length encoding (RLE) is a very simple form of data
compression.
• This intuitive principle works best on data types
in which runs of repeated data values occur.
• RLE may be used on any kind of data regardless of its content, but the data
being compressed determines how good a compression ratio is achieved.
• RLE is used on text files that contain multiple spaces for indentation and
for formatting paragraphs, tables and charts.
• Digitized signals also contain runs of unchanged values, so such signals can
also be compressed by RLE.
• A fair compression ratio may be achieved if RLE is
applied to computer-generated color images.
Example 1
• Input (one string, wrapped across lines):
WWWWWWWWWWWWBWWWWWWWWWWWW
BBB
WWWWWWWWWWWWWWWWWWWWWWWW
BWWWWWWWWWWWWWW
• Output: 12WB12W3B24WB14W (a run of length 1 is written
without a count)
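The example can be reproduced with a short sketch. The function name `rle_encode` is an assumption, and the output convention (count followed by symbol, with the count omitted for runs of length 1) is inferred from the encoded string above. This convention would be ambiguous for data that itself contains digits.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of identical symbols by <count><symbol>; omit a count of 1."""
    parts = []
    for symbol, run in groupby(text):
        n = len(list(run))
        parts.append(symbol if n == 1 else f"{n}{symbol}")
    return "".join(parts)

# The input string from the example, built from its run lengths.
line = "W" * 12 + "B" + "W" * 12 + "B" * 3 + "W" * 24 + "B" + "W" * 14
print(rle_encode(line))  # 12WB12W3B24WB14W
```

The 67-character input shrinks to 16 characters.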
Shift code:
A shift code is generated by
• Arranging the source symbols so that their probabilities
are monotonically decreasing,
• Dividing the total number of symbols into symbol blocks
of equal size,
• Coding the individual elements within all blocks
identically, and
• Adding special shift-up or shift-down symbols to identify
each block. Each time a shift-up or shift-down symbol is
recognized at the decoder, it moves one block up or down
with respect to a pre-defined reference block.
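The steps above can be sketched for one common variant, a binary shift code. The details here are assumptions not stated on the slide: code words are 3 bits wide, each block holds 2^3 - 1 = 7 symbols, the all-ones word 111 serves as the shift-up symbol, the first block is the reference block, and no shift-down symbol is used. The function name `binary_shift_code` is hypothetical.

```python
def binary_shift_code(rank, bits=3):
    """Code word for the symbol at position `rank` (0 = most probable).

    Each block holds 2**bits - 1 symbols; the all-ones word is reserved
    as the shift-up symbol and is emitted once per block skipped.
    """
    block_size = 2 ** bits - 1
    shift_up = "1" * bits
    shifts, offset = divmod(rank, block_size)
    return shift_up * shifts + format(offset, f"0{bits}b")

for rank in (0, 6, 7, 14):
    print(rank, binary_shift_code(rank))
# 0 000
# 6 110
# 7 111000
# 14 111111000
```

Symbols in the reference block cost 3 bits, and each additional block costs 3 more, so the most probable symbols keep the shortest code words.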
Arithmetic coding
• Unlike the variable-length codes described previously,
arithmetic coding generates non-block codes.
• The code word itself defines an interval of real numbers between
0 and 1
• The idea behind arithmetic coding is to have a
probability line, 0-1, on which each symbol is assigned
a range.
Example
Symbol  Count  Range
a       2      [0.0, 0.5)
b       1      [0.5, 0.75)
c       1      [0.75, 1.0)
Algorithm to compute the output number
• Low = 0
• High = 1
• Loop over all the symbols:
    Range = High - Low
    High = Low + Range * high_range of the symbol being coded
    Low = Low + Range * low_range of the symbol being coded
• Output any number in the final interval [Low, High)
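The loop can be sketched in code using the three-symbol table a, b, c from the example. This is a minimal illustration under stated assumptions: the message "bac" is chosen only for demonstration, the names `RANGES` and `arithmetic_encode` are hypothetical, and exact fractions are used so that rounding does not obscure the arithmetic. A practical coder would use finite-precision integers instead.

```python
from fractions import Fraction

# Symbol ranges from the example table: a [0, 0.5), b [0.5, 0.75), c [0.75, 1).
RANGES = {
    "a": (Fraction(0), Fraction(1, 2)),
    "b": (Fraction(1, 2), Fraction(3, 4)),
    "c": (Fraction(3, 4), Fraction(1)),
}

def arithmetic_encode(message, ranges):
    """Narrow [low, high) once per symbol and return the final interval."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        span = high - low
        sym_low, sym_high = ranges[symbol]
        # High is updated first because both updates need the old value of low.
        high = low + span * sym_high
        low = low + span * sym_low
    return low, high

low, high = arithmetic_encode("bac", RANGES)
print(float(low), float(high))  # 0.59375 0.625
```

Any number in [0.59375, 0.625), for instance 0.6, identifies the message "bac" once the message length is known.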
Symbol   Range   Low value   High value
(start)          0           1
b        1       0.5         0.75
So any number in the interval [0.06752, 0.0688), for example
0.068, can be used to represent the message.
Decode 0.39.
Since 0.8>code word > 0.4, the first symbol should be a3.
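Decoding repeats the same test for every symbol: find the range that contains the code word, emit that symbol, then rescale the code word to [0, 1) and continue. The sketch below illustrates this with the three-symbol table a, b, c from the earlier example rather than the a1 to a4 table this slide refers to, which is an assumption made for illustration. The function name `arithmetic_decode` and the convention that the decoder is told the message length are also assumptions.

```python
from fractions import Fraction

# Symbol ranges from the earlier example: a [0, 0.5), b [0.5, 0.75), c [0.75, 1).
RANGES = {
    "a": (Fraction(0), Fraction(1, 2)),
    "b": (Fraction(1, 2), Fraction(3, 4)),
    "c": (Fraction(3, 4), Fraction(1)),
}

def arithmetic_decode(value, length, ranges):
    """Recover `length` symbols from a code word `value` in [0, 1)."""
    out = []
    for _ in range(length):
        for symbol, (sym_low, sym_high) in ranges.items():
            if sym_low <= value < sym_high:
                out.append(symbol)
                # Rescale so the chosen symbol's range maps back onto [0, 1).
                value = (value - sym_low) / (sym_high - sym_low)
                break
        else:
            raise ValueError("code word lies outside [0, 1)")
    return "".join(out)

print(arithmetic_decode(Fraction(3, 5), 3, RANGES))  # bac
```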