
Digital Image Processing

Prepared by
K.Indragandhi,AP(Sr.Gr.)/ECE

1
Module-IV
IMAGE COMPRESSION

2
• Definition: Compression means storing data in a
format that requires less space than usual.

• Data compression is particularly useful in communications because it enables devices to transmit the same amount of data in fewer bits.

3
• The bandwidth of a digital communication link
can be effectively increased by compressing data
at the sending end and decompressing data at the
receiving end.

• There are a variety of data compression techniques, but only a few have been standardized.

4
Types of Data Compression

• There are two main types of data compression: Lossy and Lossless.

• In Lossy data compression the message can never be recovered exactly as it was before it was compressed.

5
• In a Lossless data compression file the original
message can be exactly decoded.

• Lossless compression is ideal for text.

• Huffman coding is a type of lossless data compression.

6
Compression Algorithms
• Huffman Coding

• Run Length Encoding

• Shift Codes

• Arithmetic Codes

• Block Truncation Codes

• Transform codes

• Vector Quantization

7
Huffman Coding

• Huffman coding is a popular compression technique that assigns variable length codes (VLC) to symbols, so that the most frequently occurring symbols have the shortest codes.

• On decompression the symbols are reassigned their original fixed length codes.

8
• The idea is to use short bit strings to represent the most frequently used characters and longer bit strings to represent less frequently used characters.

9
• That is, the most common characters, usually space, e,
and t are assigned the shortest codes.

• In this way the total number of bits required to transmit the data can be considerably less than the number required if the fixed length ASCII representation is used.

• A Huffman code is a binary tree with branches assigned the value 0 or 1.

10
11
Huffman Algorithm

• To each character, associate a binary tree consisting of just one node.

• To each tree, assign the character’s frequency, which is called the tree’s weight.

• Look for the two lightest-weight trees. If there are more than two, choose among them randomly.

12
• Merge the two into a single tree with a new root node whose left and right subtrees are the two we chose.

• Assign the sum of weights of the merged trees as the weight of the new tree.

• Repeat the previous step until just one tree is left.

13
Huffman Coding Example
• Character frequencies
– A: 20% (.20)
– B: 9% (.09)
– C: 15%
– D: 11%
– E: 40%
– F: 5%

• No other characters in the document

14
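The merge procedure from the previous slides can be sketched in a few lines of Python (an illustrative sketch, not from the slides; heapq is used to repeatedly pick the two lightest-weight trees). Because ties and the 0/1 labels on branches can go either way, the exact bit patterns may differ from the tree shown next, but the code lengths come out the same.

```python
import heapq
import itertools

def huffman_codes(freqs):
    """freqs: dict symbol -> weight. Returns dict symbol -> code string."""
    counter = itertools.count()          # tie-breaker so heap entries always compare
    heap = [(w, next(counter), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # the two lightest-weight trees
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(counter), (left, right)))
    _, _, root = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: branch 0 / branch 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"      # degenerate single-symbol alphabet
        return codes
    return walk(root, "")

print(huffman_codes({"A": .20, "B": .09, "C": .15, "D": .11, "E": .40, "F": .05}))
# Code lengths: E -> 1 bit, A/C/D -> 3 bits, B/F -> 4 bits, as on the following slides.
```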
Huffman Code

[Figure: first merge step of the Huffman tree. B (.09) and F (.05) are joined under branches 0 and 1 into a node BF of weight .14; the remaining trees are E (.4), BF (.14), D (.11), A (.20), C (.15).]

15
Huffman Code

• Codes
  – A: 010
  – B: 0000
  – C: 011
  – D: 001
  – E: 1
  – F: 0001

[Figure: complete Huffman tree. The root ABCDEF (1.0) splits into ABCDF (.6) on branch 0 and E (.4) on branch 1; ABCDF splits into BFD (.25) and AC (.35); BFD splits into BF (.14) and D (.11); AC splits into A (.20) and C (.15); BF splits into B (.09) and F (.05).]

• Note
  – None of the codes is a prefix of another

16
Huffman Coding

• TENNESSEE

• ENCODING
  – E : 1
  – S : 00
  – T : 010
  – N : 011

Huffman tree (weights in parentheses):

        9
      0/ \1
      5    e(4)
    0/ \1
  s(2)   3
       0/ \1
     t(1)  n(2)

• Average code length = (1*4 + 2*2 + 3*2 + 3*1) / 9 ≈ 1.89

17
Average Code Length

Average code length =

  Σ (length_i × frequency_i) / Σ frequency_i        (summing over all symbols i)

  = { 1(4) + 2(2) + 3(2) + 3(1) } / (4 + 2 + 2 + 1)

  = 17 / 9 ≈ 1.89

18
ENTROPY
Entropy is a measure of information content: the more probable a message, the lower its information content and the lower its entropy.

Entropy = − Σ (p_i log2 p_i)        (p_i is the probability of symbol i)

For TENNESSEE, p(E) = 4/9 ≈ 0.44, p(N) = p(S) = 2/9 ≈ 0.22, p(T) = 1/9 ≈ 0.11:

  = − ( 0.44 log2 0.44 + 0.22 log2 0.22 + 0.22 log2 0.22 + 0.11 log2 0.11 )

  = − ( 0.44 log 0.44 + 2 (0.22 log 0.22) + 0.11 log 0.11 ) / log 2

  ≈ 1.837

19
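As a quick check of the two formulas above, here is a short Python sketch (illustrative, not from the slides); the code lengths are the ones from the TENNESSEE tree two slides back.

```python
from collections import Counter
from math import log2

message = "TENNESSEE"
lengths = {"E": 1, "S": 2, "T": 3, "N": 3}   # code lengths from the Huffman tree above

counts = Counter(message)                    # E: 4, N: 2, S: 2, T: 1
total = sum(counts.values())

avg_len = sum(lengths[s] * c for s, c in counts.items()) / total
entropy = -sum((c / total) * log2(c / total) for c in counts.values())

print(round(avg_len, 2))     # 1.89 bits per symbol
print(round(entropy, 3))     # 1.837 bits per symbol
```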
Advantages & Disadvantages

• The problem with Huffman coding is that it uses an integral number of bits in each code.

• If the entropy of a given character is 2.5 bits, the Huffman code for that character must be either 2 or 3 bits, not 2.5.

20
• Though Huffman coding is inefficient in that it uses an integral number of bits per code, it is relatively easy to implement and very efficient for coding and decoding.

• It provides the best approximation that can be achieved when each symbol must be coded with a whole number of bits.

21
Run-length encoding

22
• Run-length encoding (RLE) is a very simple form of data
compression encoding.

• RLE is a lossless type of compression

• It is based on a simple principle of encoding data: every stream formed of the same data values (a sequence of repeating values is called a run) is replaced with a count and a single value.

23
• This intuitive principle works best on certain data types in which sequences of repeated data values can be noticed.

• RLE is usually applied to files that contain a large number of consecutive occurrences of the same byte pattern.

24
• RLE may be used on any kind of data regardless of its content, but the data being compressed determines how good a compression ratio will be achieved.

• RLE is used on text files that contain multiple spaces for indentation and for formatting paragraphs, tables and charts.

• Digitized signals also consist of unchanging streams, so such signals can also be compressed by RLE.

• A good example of such a signal is a monochrome image; only questionable compression would probably be achieved if RLE were used on continuous-tone (photographic) images.

25
• A fair compression ratio may be achieved if RLE is applied to computer-generated color images.

• RLE is a lossless type of compression and cannot achieve great compression ratios,

• but a good point of this compression is that it can be easily implemented and quickly executed.

26
Example 1

WWWWWWWWWWWWBWWWWWWWWWWWW
BBB
WWWWWWWWWWWWWWWWWWWWWWWW
BWWWWWWWWWWWWWW

• If we apply a simple run-length code to the above hypothetical scan line, we get the following:

• 12WB12W3B24WB14W
27
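A minimal run-length encoder in Python (an illustrative sketch, not from the slides). It follows the convention used in the answer above: each run is written as its length followed by the value, and the count is omitted when the run is only one symbol long (hence the lone B entries).

```python
from itertools import groupby

def rle_encode(data):
    """Encode consecutive runs as <count><value>, dropping the count for runs of 1."""
    out = []
    for value, run in groupby(data):
        n = len(list(run))
        out.append((str(n) if n > 1 else "") + value)
    return "".join(out)

scan_line = "W" * 12 + "B" + "W" * 12 + "B" * 3 + "W" * 24 + "B" + "W" * 14
print(rle_encode(scan_line))   # 12WB12W3B24WB14W
```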
Shift code:
A shift code is generated by
• Arranging the source symbols so that their probabilities are monotonically decreasing,
• Dividing the total number of symbols into symbol blocks of equal size,
• Coding the individual elements within all blocks identically, and
• Adding special shift-up or shift-down symbols to identify each block. Each time a shift-up or shift-down symbol is recognized at the decoder, it moves one block up or down with respect to a pre-defined reference block.

28
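The recipe above can be illustrated with a small Python sketch (illustrative only, and a simplified variant: the codewords inside each block are fixed-length binary words rather than Huffman codes, only a shift-up word is used, and the first block acts as the reference block; the symbols and block size are made up for the example).

```python
from math import ceil, log2

def shift_code(symbols, k):
    """symbols: already sorted by decreasing probability; k: symbols per block."""
    width = ceil(log2(k + 1))                 # room for k in-block words plus one shift word
    shift_word = format(k, f"0{width}b")      # reserve the last pattern as the shift symbol
    codes = {}
    for i, s in enumerate(symbols):
        block, pos = divmod(i, k)             # which block, and position inside it
        codes[s] = shift_word * block + format(pos, f"0{width}b")
    return codes

print(shift_code("abcdefg", 3))
# {'a': '00', 'b': '01', 'c': '10', 'd': '1100', 'e': '1101', 'f': '1110', 'g': '111100'}
```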
Arithmetic coding
• Unlike the variable-length codes described previously, arithmetic coding generates non-block codes.

• In arithmetic coding, a one-to-one correspondence between source symbols and code words does not exist.

• Instead, an entire sequence of source symbols (or message) is assigned a single arithmetic code word.

• Arithmetic coding is a widely used entropy coder; its only problem is its speed, but its compression tends to be better than Huffman coding can achieve.

29
• The code word itself defines an interval of real numbers between 0 and 1.

• As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units (say, bits) required to represent the interval becomes larger.

• Each symbol of the message reduces the size of the interval in accordance with its probability of occurrence.

• It is supposed to approach the limit set by entropy.

30
• The idea behind arithmetic coding is to have a probability line, 0-1,

• and to assign to every symbol a range on this line based on its probability:

• the higher the probability, the larger the range assigned to it.

• Once we have defined the ranges and the probability line, we start to encode symbols:

• every symbol defines where the output floating point number lands.

31
Example

Symbol   Count   Probability   Range

a        2       0.5           [0.0 , 0.5)

b        1       0.25          [0.5 , 0.75)

c        1       0.25          [0.75 , 1.0)
32
Algorithm to compute the output number

• Low = 0
• High = 1
• Loop over all the symbols:
      Range = High - Low
      High = Low + Range * high_range of the symbol being coded
      Low = Low + Range * low_range of the symbol being coded
33
Encoding the symbol sequence b, a, c, a:

Symbol    Range     Low value   High value

(start)             0           1

b         1         0.5         0.75

a         0.25      0.5         0.625

c         0.125     0.59375     0.625

a         0.03125   0.59375     0.609375

The output number will be 0.59375


34
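The loop on the previous slide can be written directly in Python (an illustrative sketch; the helper name is made up). Run with the ranges from the Example slide on the symbol sequence b, a, c, a, it reproduces the low and high values in the table above.

```python
def arithmetic_interval(message, ranges):
    """Narrow [0, 1) symbol by symbol; any number in the final interval encodes the message."""
    low, high = 0.0, 1.0
    for sym in message:
        rng = high - low
        sym_low, sym_high = ranges[sym]      # the symbol's range on the probability line
        high = low + rng * sym_high
        low = low + rng * sym_low
    return low, high

ranges = {"a": (0.0, 0.5), "b": (0.5, 0.75), "c": (0.75, 1.0)}
print(arithmetic_interval("baca", ranges))   # (0.59375, 0.609375); output number 0.59375
```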
Arithmetic coding

Let the message to be encoded be a1a2a3a3a4

35
[Figure: successive subdivision of the coding interval while encoding a1 a2 a3 a3 a4. The working interval narrows from [0, 1) to [0, 0.2), then [0.04, 0.08), [0.056, 0.072), [0.0624, 0.0688), and finally [0.06752, 0.0688).]

36
So, any number in the interval [0.06752, 0.0688), for example 0.068, can be used to represent the message.
Decode 0.39.
Since 0.8>code word > 0.4, the first symbol should be a3.

[Figure: successive interval subdivisions used to decode the code word.]

37
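Decoding can be sketched the same way, by repeatedly finding which sub-interval the code value falls in and rescaling (a Python sketch, illustrative only; the a1-a4 ranges below are an assumption read off the subdivision points in the encoding figure). With them, the value 0.068 chosen above decodes back to a1 a2 a3 a3 a4.

```python
def arithmetic_decode(value, ranges, n_symbols):
    """Recover n_symbols symbols from a single code value in [0, 1)."""
    message = []
    for _ in range(n_symbols):
        for sym, (lo, hi) in ranges.items():
            if lo <= value < hi:                     # which sub-interval the value falls in
                message.append(sym)
                value = (value - lo) / (hi - lo)     # rescale and repeat
                break
    return message

ranges = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}
print(arithmetic_decode(0.068, ranges, 5))   # ['a1', 'a2', 'a3', 'a3', 'a4']
```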
