MMC Module 3
MMC Module 3
r
• Entropy encoding is lossless and independent of the type
of information that is being compressed.
3.1 Introduction
a
• e.g. run-length encoding
• Compression is generally applied in multimedia
00000001111100011 ⇒ 7,5,3,2 ⇒ 111,101,011,010
communication to reduce the volume of information to
h
be transmitted or to reduce the bandwidth that is required • e.g. statistical encoding
for its transmission.
• Statistical encoding uses a set of variable-length
codewords with the shorter codewords used to
S
3.2 compression principles represent the more frequently occurring symbols.
• The theoretical minimum average number of bits that
• Lossless compression:
are required to transmit a particular source stream is
• The aim is to reduce the number of bits used to known as the entropy of the source.
represent the source information in such a way that
n
there is no loss of information. • Entropy is defined as H = − ∑ Pi log 2 Pi , where n is the
• Reversible i =1
number of different symbols in the source stream and
• Lossy compression: Pi is the probability of occurrence of symbol i.
• The aim is to reduce the number of bits used to
• For a particular codebook, the average number of bits
represent the source information by introducing n
acceptable loss of information. per codeword is given by − ∑ N i Pi , where Ni is the bit
i =1
th
length of the i codeword of the codebook.
CYH/MMT/CmpTI/p.1 CYH/MMT/CmpTI/p.2
k
3.2.2 Source coding
r
• It exploits a particular property of the source information
in order to produce an alternative form of representation
that is either a compressed version of the original form
a
or is more amenable to the application of compression.
h
• e.g. Transform encoding
S
3.3 Text compression
• Any compression algorithm associated with text must be
lossless since the loss of just a single character could
modify the meaning of a complete string.
CYH/MMT/CmpTI/p.3 CYH/MMT/CmpTI/p.4
k
Static Huffman coding
a r
S h
CYH/MMT/CmpTI/p.5 CYH/MMT/CmpTI/p.6
k
3.4 Image compression
r
• A lossless compression algorithm must be used when
transferring graphical images.
a
• Two different schemes are used for the compression of
bit-map images:
(1) based on a combination of run-length and statistical
h
encoding (lossless). (e.g. fax)
(2) based on a combination of transform, differential
and, run-length encoding (loss). (e.g. JPG)
S
• The most widely adopted image coding standard has
been developed by an international standards body
known as the Joint Photographic Experts Group (JPEG).
CYH/MMT/CmpTI/p.7 CYH/MMT/CmpTI/p.8
k
JPEG coding
a r
S h
CYH/MMT/CmpTI/p.9 CYH/MMT/CmpTI/p.10
k
• The source image is divided into 8x8 blocks, and
r
transformed into the frequency domain using the FDCT.
a
F (i, j ) = ×
4
7 7 ( 2 x + 1)iπ ( 2 y + 1) jπ
∑ ∑ p ( x, y ) cos cos
h
x =0 y =0 16 16
1 / 2 for i = 0
where c(i ) = for 0 ≤ i, j < 8
1 else
S
• F(0,0) is known as DC coefficient.
CYH/MMT/CmpTI/p.11 CYH/MMT/CmpTI/p.12
k
• Huffman encoder converts the DCT coefficients using 2
r
steps:
a
• each ac coefficient is represented by a pair of
symbols:
h
i. symbol-1 (RUNLENGTH, SIZE),
ii. symbol-2 (AMPLITUDE)
S
RUNLENGTH = the number of consecutive zero-
valued ac coefficients preceding the
• The dc coefficients are encoded using the predictive nonzero ac coefficient ∈ [0,15]
coding techniques
there is usually a strong correlation between dc SIZE = the number of bits used to encode
coefficients of adjacent 8x8 blocks AMPLITUDE ∈ [0 to 10 bits]
• entropy coding : provide additional compression by AMPLITUDE = in the range of [-1023, 1024]
encoding the quantized DCT coefficients into more
compact form e.g. if the sequence of ac coefficients is
0, 0, 0, 0, 0, 0, 476
• Two entropy coding methods are specified: ⇒ (6, 9)(476)
Huffman coding and arithmetic coding
• if RUNLENGTH > 15,
symbol-1 (15,0) ⇒ RUNLENGTH = 16
CYH/MMT/CmpTI/p.13 CYH/MMT/CmpTI/p.14
k
e.g. (15,0)(15,0)(7,4)(12) • Categories assigned to AC coefficient values/ DC
⇒ RUNLENGTH =
r
39 coefficient differences:
SIZE = 4
AMPLITUDE = 12 Size Amplitude range
0* 0
a
1 (-1,1)
• (0,0) ⇒ End Of Block (EOB)
2 (-3,-2)(2,3)
3 (-7..-4)(4..7)
• for dc coefficients:
h
4 (-15..-8)(8..15)
5 (-31..-16)(16..31)
symbol-1 (SIZE) 6 (-63..-32)(32..63)
symbol-2 (AMPLITUDE) 7 (-127..-64)(64..127)
S
the range of dc coefficients = [-2048, 2047] 8 (-255..-128)(128..255)
9 (-511..-256)(256..511)
10 (-1023..-512)(512..1023)
2. converting intermediate symbol sequence into binary 11* (-2047..-1024)(1024..2047)
*
sequence using Huffman tables These 2 categories are not used for coding AC
coefficient values.
• symbols are replaced with variable length codes: dc
coefficient, then ac coefficients • Compression ratio:
• each symbol-1 encoded with a Variable-Length original data size
CR =
Code (VLC) compressed data size
• symbol-2 are encoded using a Variable-Length • Average number of bits per pixel:
Integer (VLI) code encoded number of bits
Nb =
number of pixels
e.g. (1,4)(12) → (1111101011100)
CYH/MMT/CmpTI/p.15 CYH/MMT/CmpTI/p.16
k
Sequential JPEG Encoding Example (f) Zig-zag sequence
r
Encoding a Single Block 61, -3,4,-1,-4,2,0,2,-2,0,0,0,0,0,2,0,0,0,1,0,0,0,
0,0,0,-1,0,0,-1,0,0,0,0,-1,0,0,0,0,0,0,0,-1,0,0,0,
(a) Original 8x8 block (c) Quantization table 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
a
(quality=2)
(g) Intermediate symbol sequence
140 144 147 140 140 155 179 175 3 5 7 9 11 13 15 17
h
144 152 140 147 140 148 167 179 5 7 9 11 13 15 17 19
(6)(61),(0,2)(-3),(0,3)(4),(0,1)(-1),(0,3)(-4),
152 155 136 167 163 162 152 172 7 9 11 13 15 17 19 21
168 145 156 160 152 155 136 160 9 11 13 15 17 19 21 23
(0,2)(2), (1,2)(2),(0,2)(-2),( 5,2)(2),(3,1)(1),
162 148 156 148 140 136 147 162 11 13 15 17 19 21 23 25 (6,1)(-1),(2,1)(-1),(4,1)(-1),(7,1)(-1),(0,0)
S
147 167 140 155 155 140 136 162 13 15 17 19 21 23 25 27
136 156 123 167 162 144 140 147 15 17 19 21 23 25 27 29 (h) Encoded bit sequence (total 98 bits)
148 155 136 155 152 147 147 136 17 19 21 23 25 27 29 31
(1110)(111101),(01)(00),(100)(100),(00)(0),
(100)(011),(01)(10),(11011)(10),(01)(01),
(b) Block after FDCT (d) Block after
(11111110111)(10),(111010)(1),
quantization
(1111011)(0),(11100)(0),(111011)(0),
185 -17 14 -8 23 -9 -13 -18 61 -3 2 0 2 0 0 -1 (11111010)(0),(1010)
20 -34 26 -9 -10 10 13 6 4 -4 2 0 0 0 0 0
-10 -23 -1 6 -18 3 -20 0 -1 -2 0 0 -1 0 -1 0
-8 -5 14 -14 -8 -2 -3 8 0 0 1 0 0 0 0 0
-3 9 7 1 -11 17 18 15 0 0 0 0 0 0 0 0
3 -2 -18 8 8 -3 0 -6 0 0 -1 0 0 0 0 0
8 0 -2 3 -1 -7 -1 -1 0 0 0 0 0 0 0 0
0 -7 -2 1 1 4 -6 0 0 0 0 0 0 0 0 0
r
luminance difference:
(SIZE) Code Word
0 00
a
1 010
2 011
3 100
4 101
h
5 110
6 1110
7 11110
8 111110
S
9 1111110
10 11111110