Unit-4 (Second Half)
4.1.1 Types of Compression
Compression and decompression techniques are utilized in a number of applications, such as facsimile
systems, printer systems, document storage and retrieval systems, video teleconferencing systems, and
electronic multimedia messaging systems. An important standardization of compression algorithms was
achieved by the CCITT when it specified Group 3 compression for facsimile systems.
When information is compressed, redundancies are removed.
Sometimes removing redundancies is not sufficient to reduce the size of the data object to manageable
levels. In such cases, some real information is also removed. The primary criterion is that removal of the
real information should not perceptibly affect the quality of the result. In the case of video, compression
causes some information to be lost; information at a detail level is considered not essential for a
reasonable reproduction of the scene. This type of compression is called lossy compression. Text
compression, on the other hand, is not lossy; it is called lossless compression.
Lossless Compression
In lossless compression, data is not altered or lost in the process of compression or decompression.
Decompression generates an exact replica of the original object. Text compression is a good example of
lossless compression. The repetitive nature of text, sound and graphic images allows replacement of
repeated strings of characters or bits by codes. Lossless compression techniques are good for text data and
for repetitive data in images, such as binary images and gray-scale images.
Some of the commonly accepted lossless standards are given below:
Packbits encoding (Run-length encoding)
CCITT Group 3 1D
CCITT Group 3 2D
CCITT Group 4
Lempel-Ziv-Welch (LZW) algorithm
In lossy compression, some loss of information occurs when the object is compressed.
Lossy compression is used for compressing audio, gray-scale or color images, and video objects in which
absolute data accuracy is not essential.
IT-6501 Graphics and Multimedia Unit-4
TABLE 2: CCITT Group 3 1D run-length terminating codes

Run Length   White Code Word   Black Code Word
0            00110101          0000110111
1            000111            010
2            0111              11
3            1000              10
4            1011              011
5            1100              0011
6            1110              0010
7            1111              00011
8            10011             000101
9            10100             000100
10           00111             0000100
11           01000             0000101
12           001000            0000111
13           000011            00000100
14           110100            00000111
15           110101            000011000
16           101010            0000010111
17           101011            0000011000
18           0100111           0000001000
19           0001100           00001100111
20           0001000           00001101000
21           0010111           00001101100
22           0000011           00000110111
23           0000100           00000101000
24           0101000           00000010111
25           0101011           00000011000
26           0010011           000011001010
27           0100100           000011001011
28           0011000           000011001100
29           00000010          000011001101
30           00000011          000001101000
31           00011010          000001101001
32           00011011          000001101010
33           00010010          000001101011
34           00010011          000011010010
35           00010100          000011010011
For example, from Table 2, the run-length code of 16 white pixels is 101010, and that of 16 black pixels
is 0000010111. Statistically, the occurrence of 16 white pixels is more frequent than the occurrence of 16
black pixels, so the code generated for 16 white pixels is much shorter. This allows for quicker
decoding. For this example, a coding tree can be constructed.
Run Length   White Code Word   Black Code Word
36           00010101          000011010100
37           00010110          000011010101
38           00010111          000011010110
39           00101000          000011010111
40           00101001          000001101100
41           00101010          000001101101
42           00101011          000011011010
43           00101100          000011011011
44           00101101          000001010100
45           00000100          000001010101
46           00000101          000001010110
47           00001010          000001010111
48           00001011          000001100100
49           01010010          000001100101
50           01010011          000001010010
51           01010100          000001010011
52           01010101          000000100100
53           00100100          000000110111
The codes for run lengths greater than 1792 pixels are identical for black and white pixels. A new code indicates
reversal of color; that is, the pixel color code is relative to the color of the previous pixel sequence.
Table 3 shows the codes for pixel sequences larger than 1792 pixels.
Run Length   Make-up Code (Black and White)
1792         00000001000
1856         00000001100
1920         00000001101
1984         000000010010
2048         000000010011
2112         000000010100
2176         000000010101
2240         000000010110
2304         000000010111
2368         000000011000
2432         000000011001
2496         000000011010
2560         000000011011
CCITT Group 3 compression utilizes Huffman coding to generate a set of make-up codes and a set of
terminating codes for a given bit stream. Make-up codes are used to represent run lengths in multiples of 64
pixels. Terminating codes are used to represent run lengths of less than 64 pixels.
As shown in Table 2, run-length codes for black pixels are different from the run-length codes for white
pixels. For example, the make-up code for 64 white pixels is 11011, while the make-up code for 64 black
pixels is 0000001111. Consequently, a run of 132 white pixels is encoded by the following two
codes:
Make-up code for 128 white pixels - 10010
Terminating code for 4 white pixels - 1011
The compressed bit stream for 132 white pixels is 100101011, a total of nine bits. The
compression ratio is therefore about 14.7: the total number of bits needed to represent the pixels directly (132)
divided by the number of bits used to code them (9).
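As a sketch, this make-up plus terminating encoding of a white run can be written in a few lines of Python. Only the handful of code words needed for this example are included; a real encoder would carry the complete white and black tables from Table 2 and the make-up code tables.

```python
# Sketch of CCITT Group 3 1D encoding for a single run of white pixels.
# Only the code words used in this example are included; the full
# standard defines terminating codes (0-63) and make-up codes
# (64, 128, ... in multiples of 64) for both colors.

WHITE_MAKEUP = {64: "11011", 128: "10010"}       # run length -> code word
WHITE_TERMINATING = {0: "00110101", 4: "1011"}   # run length -> code word

def encode_white_run(run_length):
    """Encode a white run as a make-up code (largest multiple of 64)
    followed by a terminating code (the remainder below 64)."""
    bits = ""
    makeup = (run_length // 64) * 64
    if makeup:
        bits += WHITE_MAKEUP[makeup]             # e.g. 128 -> "10010"
    bits += WHITE_TERMINATING[run_length % 64]   # e.g. 4 -> "1011"
    return bits

code = encode_white_run(132)
print(code, len(code))                               # 100101011 9
print(f"compression ratio: {132 / len(code):.1f}")   # compression ratio: 14.7
```

Running it reproduces the nine-bit stream from the text: 128 pixels map to the make-up code 10010 and the remaining 4 pixels to the terminating code 1011.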
CCITT Group 3 uses a very simple data format. This consists of sequential blocks of data for each scanline,
as shown in Table 4.
[Figure: Coding tree for 16 white pixels]
EOL | DATA LINE 1 | FILL | EOL | DATA LINE 2 | FILL | EOL | ... | DATA LINE n | FILL | EOL | EOL | EOL

Note that the file is terminated by a number of EOLs (End of Line) if there is no change in the line from the
previous line (for example, white space).
TABLE 4: CCITT Group 3 1D File Format
Advantages of CCITT Group 3 1D
CCITT Group 3 compression has been used extensively due to the following two advantages:
It is simple to implement in both hardware and software.
It is a worldwide standard for facsimile which is accepted for document imaging applications. This allows
document imaging applications to incorporate fax documents easily.
Color Characteristics
We typically define a color by its brightness, its hue and its depth (saturation).
Luminance or Brightness
This is the measure of the brightness of the light emitted or reflected by an object; it depends on the radiant
energy of the color band.
Hue
This is the color sensation produced in an observer due to the presence of certain wavelengths of
light. Each wavelength represents a different hue.
Saturation
This is a measure of color intensity, for example, the difference between red and pink.
Color Models
Several color models have been developed to represent color mathematically.
Chromaticity Model
It is a three-dimensional model with two dimensions, x and y, defining the color, and the third dimension
defining the luminance. It is an additive model since x and y are added to generate different colors.
RGB Model
RGB means Red, Green, Blue. This model implements additive theory in that different
intensities of red, green and blue are added to generate various colors.
HSI Model
The Hue, Saturation and Intensity (HSI) model represents an artist's impression of tint, shade
and tone. This model has proved suitable for image processing for filtering and smoothing images.
CMYK Model
The Cyan, Magenta, Yellow and Black color model is used in desktop publishing printing devices. It
is a color-subtractive model and is best used in color printing devices only.
YUV Representation
The NTSC developed the YUV three-dimensional color model.
Y - Luminance component
UV - Chrominance components
The luminance component contains the black and white or grayscale information. The chrominance
components contain the color information, where U is red minus cyan and V is magenta minus green.
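As an illustration, the split of an RGB pixel into one luminance and two chrominance components can be sketched with the commonly used NTSC luminance weights. The exact scale factors for U and V vary between YUV variants, so treat these coefficients as representative rather than definitive.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the NTSC luminance weights.
    Y carries the grayscale information; U and V carry color differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)   # scaled blue-minus-luminance difference
    v = 0.877 * (r - y)   # scaled red-minus-luminance difference
    return y, u, v

# A pure gray pixel has no chrominance: U and V both come out zero,
# so all of its information lives in the luminance channel.
y, u, v = rgb_to_yuv(128, 128, 128)
```

This is why grayscale information survives even if the chrominance channels are compressed aggressively.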
YUV Model for JPEG
The JPEG compression scheme uses several stages.
The first stage converts the signal from the RGB color space to the YUV color space; a discrete cosine
transform then takes the image from the spatial domain to the frequency domain. This process allows
separating the luminance (gray-scale) components from the chrominance components of the image.
redundancy by transforming data from a spatial domain to a frequency domain; the quantizer quantizes DCT
coefficients with weighting functions to generate quantized DCT coefficients optimized for the human
eye; and the entropy encoder minimizes the entropy of the quantized DCT coefficients.
The JPEG method is a symmetric algorithm. Here, decompression is the exact reverse process of
compression.
Figure below describes a typical DCT-based encoder and decoder (symmetric operation of a DCT-based
codec).
Figure below shows the components and sequence of quantization of 8 x 8 image blocks.
Quantization
Quantization is a process of reducing the precision of an integer, thereby reducing the number of bits
required to store the integer.
The baseline JPEG algorithm supports four color quantization tables and two Huffman tables for both DC
and AC DCT coefficients. The quantized coefficient is described by the following equation:

Quantized Coefficient(i, j) = DCT(i, j) / Quantum(i, j)
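A minimal sketch of this equation in Python, using hypothetical DCT coefficients and quantum values for a 4 x 4 corner of a block (real JPEG works on full 8 x 8 tables):

```python
# Quantized Coefficient(i, j) = round(DCT(i, j) / Quantum(i, j))
# The DCT and quantum values below are illustrative, not from any
# standard table.

dct = [[240, 16, -12, 0],
       [ 18, -9,   4, 0],
       [ -5,  3,   0, 0],
       [  0,  0,   0, 0]]
quantum = [[16, 11, 10, 16],
           [12, 12, 14, 19],
           [14, 13, 16, 24],
           [14, 17, 22, 29]]

quantized = [[round(dct[i][j] / quantum[i][j]) for j in range(4)]
             for i in range(4)]
print(quantized[0][0])   # 15: the large low-frequency term survives
print(quantized[2][1])   # 0:  small high-frequency terms quantize to zero
```

The many zeros produced in the high-frequency positions are exactly what the run-length and zigzag stages below exploit.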
ZigZag Sequence
Run-length encoding generates a code to represent the count of zero-value DCT coefficients. This process
of run-length encoding gives an excellent compression of a block consisting mostly of zero values.
Further empirical work showed that the length of zero-value runs can be increased to give a further
increase in compression by reordering the runs. JPEG came up with ordering the quantized DCT
coefficients in a ZigZag sequence: the sequence in which the cells are encoded.
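The zigzag order itself is easy to generate. The sketch below assumes the standard anti-diagonal traversal (it is not tied to any particular JPEG library) and lists the cell visiting order for an 8 x 8 block:

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of an n x n block in a
    zigzag scan: cells on the same anti-diagonal are grouped, and the
    traversal direction alternates on each diagonal."""
    order = []
    for d in range(2 * n - 1):
        diagonal = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        if d % 2 == 0:
            diagonal.reverse()   # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order

seq = zigzag_order(8)
print(seq[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because low-frequency coefficients cluster at the top-left, this ordering pushes the zero-valued high-frequency coefficients to the end of the sequence, producing long zero runs.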
Entropy Encoding
Entropy is a term used in thermodynamics for the study of heat and work. Entropy, as used in data
compression, is the measure of the information content of a message in number of bits. It is represented as

Entropy in number of bits = -log2 (probability of object)
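A worked example of this formula, with the minus sign making the bit count positive for probabilities below one:

```python
import math

def bits_for(probability):
    """Information content of a symbol: -log2 of its probability.
    Frequent symbols deserve short codes; rare symbols deserve long ones."""
    return -math.log2(probability)

print(bits_for(0.5))    # 1.0  - a 50/50 symbol needs one bit
print(bits_for(0.125))  # 3.0  - a 1-in-8 symbol needs three bits
```

Entropy coders such as Huffman and arithmetic coding approach these ideal code lengths.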
Huffman versus Arithmetic coding
Huffman coding requires that one or more sets of Huffman code tables be specified by the application for
coding as well as decoding. For arithmetic coding, JPEG does not require coding tables; the coder is able
to adapt to the image statistics as it encodes the image.
DC coefficient coding
Before DC coefficients are compressed, DC prediction is performed first. In DC prediction, the DC
coefficient of the previous 8x8 block is subtracted from that of the current 8x8 block.
Two 8x8 blocks of a quantized matrix are shown in Figure 2.6. The differential DC coefficient is
Delta D = DC(x) - DC(x-1).
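DC prediction can be sketched with a short differential-coding example; the DC values below are hypothetical:

```python
# Differential coding of DC coefficients: each block stores only the
# difference from the previous block's DC value.

dc_values = [240, 238, 241, 241, 235]          # DC of successive 8x8 blocks
deltas = [dc_values[0]] + [dc_values[k] - dc_values[k - 1]
                           for k in range(1, len(dc_values))]
print(deltas)   # [240, -2, 3, 0, -6] - small numbers, cheaper to entropy-code

# The decoder recovers the original values by a running sum.
recovered = []
for d in deltas:
    recovered.append(d + (recovered[-1] if recovered else 0))
assert recovered == dc_values
```

Since neighboring blocks usually have similar average brightness, the differences are small and entropy-code more compactly than the raw DC values.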
AC coefficient coding
Each AC coefficient is encoded by utilizing two symbols, Symbol-1 and Symbol-2. Symbol-1 represents two
pieces of information called "run length" and "size". Symbol-2 represents the amplitude of the AC
coefficient.
H.261 is designed for dynamic use and provides a fully contained organization and a high level of
interactive control.
Moving Picture Experts Group Compression
The MPEG standards consist of a number of different standards.
The MPEG-2 suite of standards consists of standards for MPEG-2 Video, MPEG-2 Audio and MPEG-2
Systems. It is also defined at different profiles and levels.
The main profile is designed to cover the largest number of applications. It supports digital video
compression in the range of 2 to 15 Mbits/sec. It also provides a generic solution for television worldwide,
including cable, direct broadcast satellite, fibre optic media, and optical storage media (including digital
VCRs).
MPEG Coding Methodology
The above requirements can be achieved only by incremental coding of successive frames, known
as interframe coding. Random access to information by frame requires coding confined to a specific
frame, known as intraframe coding.
The MPEG standard addresses these two requirements by providing a balance between interframe coding
and intraframe coding. The MPEG standard also provides for recursive and non-recursive temporal
redundancy reduction.
The MPEG video compression standard provides two basic schemes: discrete-transform-based compression
for the reduction of spatial redundancy, and block-based motion compensation for the reduction of temporal
(motion) redundancy. During the initial stages of DCT compression, both the full-motion MPEG and still-
image JPEG algorithms are essentially identical. First an image is converted to the YUV color space (a
luminance/chrominance color space similar to that used for color television). The pixel data is then fed into
a discrete cosine transform, which creates a scalar quantization (a two-dimensional array representing
various frequency ranges represented in the image) of the pixel data.
Following quantization, a number of compression algorithms are applied, including run-length and Huffman
encoding. For full-motion video (MPEG-1 and 2), several more levels of block-based motion-compensated
techniques are applied to reduce temporal redundancy, with both causal and noncausal coding to further
reduce spatial redundancy.
The MPEG algorithm for spatial reduction is lossy and is defined as a hybrid which employs motion
compensation, forward discrete cosine transform (DCT), a uniform quantizer, and Huffman coding. Block-
based motion compensation is utilized for reducing temporal redundancy (i.e., to reduce the amount of data
needed to represent each picture in a video sequence). Motion-compensated reduction is a key feature of
MPEG.
Let us review the concept of macroblocks and understand the role they play in compression.
MACRO BLOCKS
For the video coding algorithm recommended by CCITT, CIF and QCIF are divided into a hierarchical
block structure consisting of pictures, groups of blocks (GOBs), macroblocks (MBs), and blocks. Each
picture frame is divided into 16 x 16 blocks. Each macroblock is composed of four 8 x 8 luminance (Y)
blocks and two 8 x 8 chrominance (Cb and Cr) blocks. This set of six blocks, called a macroblock, is the
basic hierarchical component used for achieving a high level of compression.
Motion compensation
Motion compensation is the basis for most compression algorithms for visual telephony and full-motion
video. Motion compensation assumes that the current picture is some translation of a previous picture. This
creates the opportunity for using prediction and interpolation. Prediction requires only the current frame and
the reference frame.
Based on the motion vector values generated, the prediction approach attempts to find the relative new
position of the object and confirms it by comparing blocks exhaustively. In the interpolation approach, the
motion vectors are generated in relation to two reference frames, one from the past and one for the next
predicted frame.
The best-matching blocks in both reference frames are searched, and the average is taken as the position of
the block in the current frame. The motion vectors for the two reference frames are averaged.
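The exhaustive block-comparison step of the prediction approach can be sketched as a sum-of-absolute-differences (SAD) search over a toy reference frame. The block size, search range and pixel values here are all illustrative, not drawn from any standard:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
                          for a, b in zip(row_a, row_b))

def best_match(reference, block, size, search):
    """Exhaustively scan `reference` (a 2D list) for the position whose
    size x size window best matches `block`; return (row, col, sad)."""
    best = None
    for r in range(min(search, len(reference) - size + 1)):
        for c in range(min(search, len(reference[0]) - size + 1)):
            window = [row[c:c + size] for row in reference[r:r + size]]
            score = sad(block, window)
            if best is None or score < best[2]:
                best = (r, c, score)
    return best

# Toy frame: a bright 2x2 patch at row 1, column 2 of a 4x6 reference.
ref = [[0] * 6 for _ in range(4)]
ref[1][2] = ref[1][3] = ref[2][2] = ref[2][3] = 9
patch = [[9, 9], [9, 9]]
print(best_match(ref, patch, 2, 4))   # (1, 2, 0) - an exact match
```

The returned offset is the motion vector; only that vector and the (often near-zero) prediction error need to be coded.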
Picture Coding Method
In this coding method, motion compensation is applied bidirectionally. In MPEG terminology, the motion-
compensated units are called macroblocks (MBs).
MBs are 16 x 16 blocks that contain a number of 8 x 8 luminance and chrominance blocks. Each 16 x 16
macroblock can be of type intrapicture, forward-predicted, backward-predicted, or average.
MPEG Encoder
Figure below shows the architecture of an MPEG encoder. It contains the DCT, quantizer, Huffman coder
and motion compensation modules. These represent the key modules in the encoder.
encoding. For full-motion video, several more levels of motion compensation compression and coding are
applied.
MPEG-2
It is defined to include current television broadcasting compression and decompression needs, and attempts
to include hooks for HDTV broadcasting.
The MPEG-2 standard supports:
1. Video coding: MPEG-2 profiles and levels.
2. Audio coding: MPEG-1 audio standard for backward compatibility; Layer-2 audio definitions for
MPEG-2 and stereo sound; multichannel sound.
3. Multiplexing: MPEG-2 definitions.
MPEG-2, "The Grand Alliance"
It consists of the following companies: AT&T, MIT, Philips, Sarnoff Labs, GI, Thomson, and Zenith.
The MPEG-2 committee and the FCC formed this alliance. These companies together have defined the
advanced digital television system that includes the US and European HDTV systems. The outline of the
advanced digital television system is as follows:
1. Format: 1080/2:1/60 or 720/1:1/60
2. Video coding: MPEG-2 main profile and high level
3. Audio coding: Dolby AC3
4. Multiplexor: As defined in MPEG-2
5. Modulation: 8-VSB for terrestrial and 64-QAM for cable.
Vector Quantization
Vector quantization provides a multidimensional representation of information stored in look-up tables.
Vector quantization is an efficient pattern-matching algorithm in which an image is decomposed into two or
more vectors, each representing particular features of the image, that are matched to a code book of vectors.
These are coded to indicate the best fit.
In image compression, source samples such as pixels are blocked into vectors so that each vector describes
a small segment or sub-block of the original image.
The image is then encoded by quantizing each vector separately.
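A minimal sketch of the pattern-matching step, assuming a hypothetical four-entry codebook of flattened 2 x 2 sub-blocks and squared error as the distortion measure:

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codebook vector closest (by squared
    error) to the input vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: dist(vector, codebook[k]))

# Hypothetical codebook of four 2x2 sub-block patterns (flattened).
codebook = [
    [0, 0, 0, 0],          # flat dark
    [255, 255, 255, 255],  # flat bright
    [0, 255, 0, 255],      # vertical edge
    [0, 0, 255, 255],      # horizontal edge
]
block = [10, 240, 5, 250]   # a noisy vertical edge from the image
index = nearest_codeword(block, codebook)
print(index)   # 2 - the whole block is encoded as just this index
```

The compression comes from transmitting only the codebook index per block rather than the block's pixel values.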
Intel's Indeo Technology
Indeo Video, developed by Intel Architecture Labs, is a software technology that reduces the size of
uncompressed digital video files by a factor of five to ten.
Indeo technology uses multiple types of lossy and lossless compression techniques.
TIFF Structure
A TIFF file consists of a header. The header consists of a byte-ordering flag, the TIFF file format version
number, and a pointer to a table. The pointer points to the image file directory. This directory contains a
table of entries of various tags and their information.
TIFF file format Header:
The next figure shows the IFD (Image File Directory) and its contents. The IFD is a variable-length table
containing directory entries. The length of the table depends on the number of directory entries in the table.
The first two bytes contain the total number of entries in the table, followed by the directory entries. Each
directory entry consists of twelve bytes. The last item in the IFD is a four-byte pointer that points to the
next IFD.
The byte content of each directory entry is as follows:
The first two bytes contain the tag number (Tag ID).
The second two bytes represent the type of data, as shown in Table 3-1 below.
The next four bytes contain the count for the data type.
The final four bytes contain the data or a pointer to it.
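The twelve-byte entry layout above can be parsed directly. The sketch below assumes a little-endian file and uses the well-known ImageWidth tag (ID 256, type 3 = SHORT) as sample data:

```python
import struct

def parse_ifd_entry(entry, byte_order="<"):
    """Unpack one 12-byte TIFF IFD entry: 2 bytes tag ID, 2 bytes data
    type, 4 bytes count, 4 bytes value-or-offset."""
    tag, dtype, count, value = struct.unpack(byte_order + "HHII", entry)
    return {"tag": tag, "type": dtype, "count": count, "value_or_offset": value}

# Little-endian sample entry: tag 256 (ImageWidth), type 3 (SHORT),
# count 1, value 640.
raw = struct.pack("<HHII", 256, 3, 1, 640)
print(parse_ifd_entry(raw))
# {'tag': 256, 'type': 3, 'count': 1, 'value_or_offset': 640}
```

The byte-ordering flag in the TIFF header determines whether `<` (little-endian, "II") or `>` (big-endian, "MM") applies.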
TIFF Tags
The first two bytes of each directory entry contain a field called the Tag ID.
Tag IDs are grouped into several categories: Basic, Informational, Facsimile, Document Storage
and Retrieval.
TIFF Classes (Version 5.0): It has five classes
1. Class B for binary images
2. Class F for fax
3. Class G for gray-scale images
4. Class P for palette color images
5. Class R for RGB full-color images
The subchunk contains a four-character ASCII string ID to identify the type of data.
Four bytes of size contain the count of data values, followed by the data. The data structure of a chunk is
the same as for all other chunks.
RIFF chunk with two subchunks:
The first four characters of the RIFF chunk are reserved for the "RIFF" ASCII string. The next four bytes
define the total data size.
The first four characters of the data field are reserved for the form type. The rest of the data field contains
two subchunks:
(i) fmt - defines the recording characteristics of the waveform.
(ii) data - contains the data for the waveform.
LIST Chunk
A RIFF chunk may contain one or more LIST chunks.
LIST chunks allow embedding additional file information such as archival location, copyright information,
creation date, and a description of the content of the file.
RIFF MIDI FILE FORMAT
RIFF MIDI contains a RIFF chunk with the form type "RMID" and a subchunk called "data" for MIDI data.
The layout is: 4 bytes for the ID of the RIFF chunk, 4 bytes for the size, 4 bytes for the form type,
4 bytes for the ID of the subchunk data, and 4 bytes for the size of the MIDI data.
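That five-field layout can be sketched by building and re-parsing a minimal RIFF MIDI wrapper. The MIDI payload here is placeholder bytes, not a playable file:

```python
import struct

# Build a minimal RIFF MIDI wrapper: "RIFF", total size, form type
# "RMID", then a "data" subchunk holding the raw MIDI bytes.
midi_bytes = b"MThd" + b"\x00" * 10          # placeholder MIDI data
data_chunk = b"data" + struct.pack("<I", len(midi_bytes)) + midi_bytes
riff_body = b"RMID" + data_chunk
riff_file = b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body

# Parse it back: each field is 4 bytes, as described above.
assert riff_file[0:4] == b"RIFF"
total_size = struct.unpack("<I", riff_file[4:8])[0]
assert riff_file[8:12] == b"RMID"
assert riff_file[12:16] == b"data"
midi_size = struct.unpack("<I", riff_file[16:20])[0]
print(total_size, midi_size)   # 26 14
```

RIFF sizes are little-endian, and the total size counts everything after the size field itself (form type plus subchunks).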
RIFF DIBs (Device-Independent Bitmaps)
DIB is a Microsoft Windows standard format. It defines bitmaps and color attributes for bitmaps
independent of devices. DIBs are normally embedded in .BMP files, .WMF metafiles, and .CLP files.
DIB Structure
BITMAPINFOHEADER | RGBQUAD | PIXELS
A RIFF DIB file format contains a RIFF chunk with the form type "RDIB" and a subchunk called "data"
for DIB data. The layout is: 4 bytes for the ID of the RIFF chunk, 4 bytes for the size, 4 bytes for the
form type, 4 bytes for the ID of the subchunk data, and 4 bytes for the size of the DIB data.
RIFF PALETTE File format
The RIFF Palette file format contains a RIFF chunk with the form type "RPAL" and a subchunk called
"data" for palette data. The Microsoft Windows logical palette structure is enveloped in the RIFF data
subchunk. The palette structure contains the palette version number, the number of palette entries, the
intensity of red, green and blue colors, and flags for the palette usage. The palette structure is described by
the following code segment:
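The code segment itself is not reproduced in these notes. As a sketch, the logical palette layout just described (version number, entry count, then one red/green/blue/flags byte quadruple per entry) can be packed as follows; the field widths are assumptions based on the Windows structure:

```python
import struct

def pack_palette(entries, version=0x0300):
    """Pack a logical palette: 2-byte version, 2-byte entry count,
    then 4 bytes (red, green, blue, flags) per palette entry."""
    data = struct.pack("<HH", version, len(entries))
    for red, green, blue, flags in entries:
        data += struct.pack("<BBBB", red, green, blue, flags)
    return data

palette = pack_palette([(255, 0, 0, 0), (0, 255, 0, 0)])
print(len(palette))   # 12 = 4 header bytes + 2 entries x 4 bytes each
```

In a RIFF palette file, these bytes would form the body of the "data" subchunk.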
The MIDI file format also contains chunks (i.e., blocks) of data. There are two types of chunks: (i) header
chunks and (ii) track chunks.
Header Chunk
It is made up of 14 bytes:
The first four-character string is the identifier string, "MThd".
The second four bytes contain the data size for the header chunk. It is set to a fixed value of six bytes.
The last six bytes contain the data for the header chunk.
Track Chunk
The track chunk is organized as follows:
The first four-character string is the identifier.
The second four bytes contain the track length.
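The 14-byte header layout can be verified with a short parser. The six data bytes hold the format, track count and time division fields; their meaning comes from the standard MIDI file specification rather than from these notes:

```python
import struct

def parse_midi_header(chunk):
    """Parse the 14-byte MIDI header chunk: 4-byte 'MThd' identifier,
    4-byte data size (always 6), then six data bytes holding the
    format, number of tracks and time division, all big-endian."""
    ident, size, fmt, ntracks, division = struct.unpack(">4sIHHH", chunk)
    assert ident == b"MThd" and size == 6
    return fmt, ntracks, division

# Build a sample header: format 1, two tracks, 480 ticks per quarter note.
header = struct.pack(">4sIHHH", b"MThd", 6, 1, 2, 480)
print(parse_midi_header(header))   # (1, 2, 480)
```

Note that MIDI files are big-endian, unlike the little-endian RIFF formats above.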
other information. Here, a standard file format is generated which can be moved across platforms and
applications.
JPEG Motion Image:
JPEG motion images are embedded in the AVI RIFF file format.
There are two standards available:
(i) MPEG - patent and copyright issues apply to it.
(ii) MPEG-2 - it provides better resolution and picture quality.
4.2.5 TWAIN
A standard interface was designed to allow applications to interface with different types of input
devices, such as scanners and digital still cameras, using a generic TWAIN interface without
creating device-specific drivers. The benefits of this approach are as follows:
1. Application developers can code to a single TWAIN specification that allows applications to
interface with all TWAIN-compliant input devices.
2. Device manufacturers can write device drivers for their proprietary devices and, by complying
with the TWAIN specification, allow the devices to be used by all TWAIN-compliant
applications.
The TWAIN architecture defines a set of application programming interfaces (APIs) and a protocol to
acquire data from input devices.
It is a layered architecture.
It has the application layer, the protocol layer, the acquisition layer and the device layer.
Application Layer: A TWAIN application sets up a logical connection with a device. TWAIN does not
impose any rules on the design of an application. However, it sets guidelines for the user interface to select
sources (logical devices) from a given list of logical devices, and also specifies user interface guidelines to
acquire data from the selected sources.
The Protocol Layer: The application layer interfaces with the protocol layer. The protocol layer is
responsible for communications between the application and acquisition layers. The protocol layer
does not specify the method of implementation of sources, physical connection to devices, control of
devices, and other device-related functionality. This clearly highlights that applications are independent
of sources. The heart of the protocol layer, as shown in the figure, is the Source Manager. It manages all
sessions between an application and the sources, and monitors data acquisition transactions.
The functionality of the Source Manager is as follows:
Provide a standard API for all TWAIN-compliant sources.
Provide selection of sources for a user from within an application.
Establish logical sessions between applications and sources, and also manage sessions between
multiple applications and multiple sources.
Act as a traffic cop to make sure that transactions and communication are routed to appropriate
sources, and also validate all transactions.
Keep track of sessions and unique session identities.
Load or unload sources as demanded by an application.
Pass all return codes from the source to the application.
Maintain a default source.
The Acquisition Layer: The acquisition layer contains the virtual device driver; it interacts directly
with the device driver. This virtual layer is also called the source. The source can be local and logically
connected to a local device, or remote and logically connected to a remote device (i.e., a device over the
network).
The source performs the following functions:
Control of the device.
Acquisition of data from the device.
Transfer of data in an agreed (negotiated) format. This can be transferred in native format or
another filtered format.
Provision of a user interface to control the device.
The Device Layer: The purpose of the device driver is to receive software commands and control the
device hardware accordingly. This is generally developed by the device manufacturer and shipped with
the device.
NEW WAVE RIFF File Format: This format contains two subchunks:
(i) fmt (ii) data.
It may contain optional subchunks:
(i) Fact
(ii) Cue points
(iii) Playlist
(iv) Associated data list.
Fact Chunk: It stores file-dependent information about the contents of the WAVE file.
Cue Points Chunk: It identifies a series of positions in the waveform data stream.
Playlist Chunk: It specifies a play order for a series of cue points.
Associated Data Chunk: It provides the ability to attach information, such as labels, to sections of the
waveform data stream.
Inst Chunk: It stores a sampled sound synthesizer's samples.
The scanner acts as the camera eye and takes a photograph of the document, creating an unaltered
electronic pixel representation of the original.
Sound and Voice: When voice or music is captured by a microphone, it generates an electrical signal. This
electrical signal has analog sinusoidal waveforms. To digitize, this signal is converted into digital voice
using an analog-to-digital converter.
Full-Motion Video: It is the most important and most complex component of a multimedia system. Video
cameras are the primary source of input for full-motion video.
Pen Driver: It is a pen device driver that interacts with the digitizer to receive all digitized information
about the pen location and builds pen packets for the recognition context manager.
Recognition Context Manager: It is the main part of the pen system. It is responsible for coordinating
Windows pen applications with the pen. It works with the recognizer, dictionary, and display driver to
recognize and display pen-drawn objects.
Recognizer: It recognizes handwritten characters and converts them to ASCII.
Dictionary: A dictionary is a dynamic link library (DLL). The Windows for Pen Computing system uses
this dictionary to validate the recognition results.
Display Driver: It interacts with the graphics device interface and display hardware. When a
user starts writing or drawing, the display driver paints the ink trace on the screen.
Video and Image Display Systems: Display System Technologies
There are a variety of display system technologies employed for decoding compressed data for display.
Mixing and scaling technology: These technologies are used for VGA screens.
VGA mixing: Images from multiple sources are mixed in the image acquisition memory.
VGA mixing with scaling: Scaler ICs are used for sizing and positioning of images in predefined windows.
Dual-buffered VGA mixing/scaling: Dual buffering prevents loss of the original image. In this technology,
a separate buffer is used to maintain the original image.
Visual Display Technology Standards
MDA: Monochrome Display Adapter
- It was introduced by IBM in 1981.
- It displays 80 columns by 25 rows of text.
- It could not display bitmap graphics.
CGA: Color Graphics Adapter
- It was introduced in 1981.
- It was designed to display both text and bitmap graphics; it supported RGB color display.
- It could display at a resolution of 640 x 200 pixels.
- It displays both 40 x 25 and 80 x 25 rows and columns of text characters.
MGA: Monochrome Graphics Adapter
- It was introduced in 1982.
- It could display both text and graphics.
- It could display at a resolution of 720 x 350 for text and 720 x 338 for graphics.
- MDA is the compatible mode for this standard.
EGA: Enhanced Graphics Adapter
- It was introduced in 1984.
- It emulated both MDA and CGA standards.
- It allowed the display of both text and graphics in 16 colors at a resolution of 640 x 350 pixels.
PGA: Professional Graphics Adapter
- It was introduced in 1985.
- It could display bitmap graphics at 640 x 480 resolution with 256 colors.
- The compatible mode of this standard is CGA.
VGA: Video Graphics Array
- It was introduced by IBM in 1988.
- It offers CGA and EGA compatibility.
- It displays both text and graphics.
of it.
Types of Scanners
A and B size scanners, large-form-factor scanners, flatbed scanners, rotary drum scanners and handheld
scanners are examples of scanners.
Charge-Coupled Devices: All scanners use charge-coupled devices as their photosensors. CCDs consist of
cells arranged in a fixed array on a small square or rectangular solid-state surface. A light source moves
across a document. The intensity of the light reflected by the mirror charges those cells. The amount of
charge depends upon the intensity of the reflected light, which depends on the pixel shade in the document.
Image Enhancement Techniques
Halftones: In a halftone process, patterns of dots used to build scanned or printed images create the
illusion of continuous shades of gray or continuous shades of color. Hence only a limited number of shades
are created. This process is used in newspaper printing.
In black and white or color photographs, by contrast, almost infinite levels of tone are used.
Dithering
Dithering is a process in which groups of pixels in different patterns are used to approximate halftone
patterns by the scanner. It is used in scanning original black and white photographs.
Image enhancement techniques include control of brightness, deskew (automatically corrects page
alignment), contrast, sharpening, emphasis, and cleaning up black noise dots by software.
Image Manipulation
It includes scaling, cropping and rotation.
Scaling: Scaling can be up or down, the scaling software is available to reduce or enlarge. This software
uses algorithms.
Cropping: Removing some parts of the image and keeping the rest as a subset of the old image.
Rotation: The image can be rotated by any number of degrees to display it at different angles.
Sampling process
Sampling is a process where the analog signal is sampled over time at regular intervals to obtain the
amplitude of the analog signal at the sampling time.
Sampling rate
The regular interval at which the sampling occurs is called the sampling rate.
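The sampling process above can be sketched in a few lines; the 1 kHz test tone and the 8 kHz sampling rate are illustrative choices, not values from the text.

```python
import math

def sample(signal, sampling_rate, duration):
    """Sample a continuous signal (a function of time in seconds)
    at regular intervals of 1/sampling_rate."""
    interval = 1.0 / sampling_rate            # the sampling interval
    n_samples = int(duration * sampling_rate)
    return [signal(n * interval) for n in range(n_samples)]

# A 1 kHz sine tone sampled at 8 kHz (a telephone-quality rate):
tone = lambda t: math.sin(2 * math.pi * 1000 * t)
samples = sample(tone, sampling_rate=8000, duration=0.001)
print(len(samples))   # -> 8 amplitude values for 1 ms of signal
```

Each returned value is the amplitude of the analog signal at one sampling instant, exactly as described above.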
Digital Voice
Speech is analog in nature and is converted to digital form by an analog-to-digital converter (ADC). An ADC
takes an input signal from a microphone and converts the amplitude of the sampled analog signal to an 8-, 16-,
or 32-bit digital value.
The four important factors governing the ADC process are sampling rate, resolution, linearity, and
conversion speed.
Sampling Rate: The rate at which the ADC takes samples of the analog signal.
Resolution: The number of bits utilized for conversion determines the resolution of the ADC.
Linearity: Linearity implies that the sampling is linear at all frequencies and that the amplitude truly
represents the signal.
Conversion Speed: The speed at which the ADC converts the analog signal into digital values. It must be
fast enough to keep up with the sampling rate.
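Resolution can be tied to the conversion step with a small sketch of quantization; the signed input range of -1.0 to +1.0 is an assumption made for illustration, not a detail from the text.

```python
def adc_sample(amplitude, resolution_bits):
    """Quantize an analog amplitude in [-1.0, +1.0] to a signed
    digital value of the given resolution (e.g. 8 or 16 bits)."""
    levels = 2 ** (resolution_bits - 1) - 1    # maximum positive code
    clipped = max(-1.0, min(1.0, amplitude))   # ADC input range limit
    return round(clipped * levels)

print(adc_sample(0.5, 8))    # -> 64 with 8-bit resolution
print(adc_sample(0.5, 16))   # -> 16384 with 16-bit resolution
```

The same amplitude maps to a much finer code at 16 bits than at 8 bits, which is what "the number of bits determines the resolution" means in practice.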
Voice Recognition Systems
Voice Recognition Systems can be classified into three types.
1. Isolated-word Speech Recognition.
2. Connected-word Speech Recognition.
3. Continuous Speech Recognition.
MIDI Interconnections
The MIDI IN port of an instrument receives MIDI messages to play the instrument's internal synthesizer.
The MIDI OUT port sends MIDI messages to an external synthesizer. The MIDI THRU port passes on
MIDI messages received at the MIDI IN port, for daisy-chaining external synthesizers.
Mode messages are used for assigning voice relationships for up to 16 channels; that is, to set the device to
MONO mode or POLY mode. Omni Mode On enables the device to receive voice messages on all
channels.
System Messages
System messages apply to the complete system rather than specific channels and do not contain any channel
numbers. There are three types of system messages: common messages, real-time messages, and exclusive
messages. In the following, we will see how these messages are used.
Common Messages: These messages are common to the complete system. They provide functions such as
selecting a song, setting the song position pointer by number of beats, and sending a tune request to an
analog synthesizer.
System Real Time Messages
These messages are used for setting the system's real-time parameters. These parameters include the timing
clock, starting and stopping the sequencer, resuming the sequencer from a stopped position, and resetting
the system.
System Exclusive messages
These messages contain manufacturer-specific data such as identification, serial number, model number, and
other information. Here, a standard file format is generated which can be moved across platforms and
applications.
Analog-to-Digital Converters: The ADC gets its input from the audio mixer and converts the amplitude of
a sampled analog signal to either an 8-bit or 16-bit digital value.
Digital-to-Analog Converter (DAC): A DAC converts digital input, in the form of WAVE files, MIDI
output, and CD audio, to analog output signals.
Sound Compression and Decompression: Most sound boards include a codec for sound compression and
decompression.
ADPCM for windows provides algorithms for sound compression.
CD-ROM Interface: The CD-ROM interface allows connecting a CD-ROM drive to the sound board.
Pixel Threshold: Setting pixel threshold levels sets a limit on the bright or dark areas of a picture. Pixel
threshold setting is also achieved through the input lookup table.
Inter-frame image processing
Inter-frame image processing is the same as point-to-point image processing, except that the image
processor operates on two images at the same time. The equation for these image operations is:
    Pixel output (x, y) = Image 1(x, y) Operator Image 2(x, y)
Image Averaging: Image averaging minimizes or cancels the effects of random noise.
Image Subtraction: Image subtraction is used to determine the change from one frame to the next, for image
comparisons such as key frame detection or motion detection.
Logical Image Operation: Logical image processing operations are useful for comparing image frames
and masking a block in an image frame.
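The inter-frame operations above (averaging, subtraction) can be sketched as one point-wise routine; representing frames as plain lists of gray values is an illustrative simplification.

```python
def frame_op(frame1, frame2, operator):
    """Apply a point-wise operator to two frames of equal size:
    pixel_out(x, y) = operator(frame1(x, y), frame2(x, y))."""
    return [[operator(p1, p2) for p1, p2 in zip(r1, r2)]
            for r1, r2 in zip(frame1, frame2)]

average  = lambda a, b: (a + b) // 2   # minimizes random noise
subtract = lambda a, b: abs(a - b)     # highlights motion / change

prev = [[10, 10], [10, 10]]
curr = [[10, 90], [10, 10]]
print(frame_op(prev, curr, subtract))  # -> [[0, 80], [0, 0]]
```

The nonzero pixel in the difference frame marks exactly where the scene changed, which is the basis of the motion detection mentioned above.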
Spatial Filter Processing: The rate of change of shades of gray or colors is called spatial frequency. The
process of generating images with either low-spatial-frequency components or high-spatial-frequency
components is called spatial filter processing. The following figure shows the calculation for one pixel using a pixel map.
Low Pass Filter: A low pass filter causes blurring of the image and appears to cause a reduction in noise.
High-Pass Filter: The high-pass filter causes edges to be emphasized. It attenuates low-spatial-frequency
components, thereby enhancing edges and sharpening the image.
Laplacian Filter: This filter sharply attenuates low-spatial-frequency components without affecting the
high-spatial-frequency components, thereby producing strong edge enhancement.
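A minimal sketch of spatial filter processing as a 3x3 convolution; skipping the border pixels is a simplification, and the kernels shown are the standard low-pass and Laplacian masks rather than ones given in the text.

```python
def convolve3x3(image, kernel):
    """Apply a 3x3 spatial filter kernel; border pixels are skipped
    for brevity (the output shrinks by one pixel on each side)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = sum(kernel[j][i] * image[y + j - 1][x + i - 1]
                      for j in range(3) for i in range(3))
            row.append(acc)
        out.append(row)
    return out

LOW_PASS  = [[1/9] * 3] * 3                        # blurs, reduces noise
LAPLACIAN = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]  # emphasizes edges

flat = [[5] * 3] * 3
print(convolve3x3(flat, LAPLACIAN))   # -> [[0]]: no edges in a flat area
```

A flat (zero-spatial-frequency) region produces zero output under the Laplacian, while the low-pass kernel simply reproduces it, matching the filter descriptions above.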
Frame Processing: Frame processing operations are most commonly used for geometric operations, image
transformation, and image data compression and decompression. Frame processing operations are very
compute-intensive, requiring many multiply and add operations similar to spatial filter convolution operations.
Image scaling: Image scaling allows enlarging or shrinking the whole or part of an image.
Image rotation: Image rotation allows the image to be rotated about a center point. The operation can be
used to rotate the image orthogonally to reorient the image if it was scanned incorrectly. The operation can
also be used for animation. The rotation formula is:
    pixel output (x, y) = pixel input (x cos Q + y sin Q, -x sin Q + y cos Q)
where Q is the orientation angle, and x, y are the spatial coordinates of the original pixel.
Image translation: Image translation allows the image to be moved up and down or side to side. Again, this
function can be used for animation.
The translation formula is:
    Pixel output (x, y) = Pixel input (x + Tx, y + Ty)
where Tx and Ty are the horizontal and vertical translation offsets, and x, y are the spatial coordinates of
the original pixel.
Image transformation: An image contains varying degrees of brightness or colors defined by the spatial
frequency. The image can be transformed from the spatial domain to the frequency domain by using a
frequency transform.
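The rotation and translation formulas above can be sketched directly; the integer image, the background fill value, and rotating about the origin (rather than a center point) are illustrative simplifications.

```python
import math

def rotate_coords(x, y, q_degrees):
    """Source coordinates for the rotation formula:
    pixel_out(x, y) = pixel_in(x cos Q + y sin Q, -x sin Q + y cos Q)."""
    q = math.radians(q_degrees)
    return (x * math.cos(q) + y * math.sin(q),
            -x * math.sin(q) + y * math.cos(q))

def translate(image, tx, ty, fill=0):
    """pixel_out(x, y) = pixel_in(x + Tx, y + Ty); pixels shifted in
    from outside the image are filled with a background value."""
    h, w = len(image), len(image[0])
    return [[image[y + ty][x + tx]
             if 0 <= y + ty < h and 0 <= x + tx < w else fill
             for x in range(w)]
            for y in range(h)]

print(rotate_coords(1, 0, 90))            # -> approximately (0, -1)
print(translate([[1, 2], [3, 4]], 1, 0))  # shift left: [[2, 0], [4, 0]]
```

Both operations look up each output pixel in the input image, so they can be repeated per frame for the animation use mentioned above.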
A video capture board can handle a variety of different audio and video input signals and convert them from
analog to digital or digital to analog.
Video Channel Multiplexer: It is similar to the video grabber's video channel multiplexer.
Video Compression and Decompression: A video compression and decompression processor is used to
compress and decompress video data.
The video compression and decompression processor contains multiple stages for compression and
decompression. The stages include forward discrete cosine transformation and inverse discrete cosine
transformation, quantization and inverse quantization, ZigZag and Zero run-length encoding and decoding,
and motion estimation and compensation.
Audio Compression: MPEG-2 uses adaptive differential pulse code modulation (ADPCM) to sample the audio
signal. The method takes the difference between the actual sample value and a predicted sample value. The
difference is then encoded as a 4-bit or 8-bit value, depending upon the sample rate.
Analog to Digital Converter: The ADC takes inputs from the video switch and converts the amplitude of a
sampled analog signal to either an 8-bit or 16-bit digital value.
Performance issues for full motion video:
During capture, the video hardware and software must be able to keep up with the output of the camera
to prevent loss of information. The requirements for playback are equally intense, although there is no risk of
permanent loss of information. Consider the example below:
It includes a few more new commands, and vendor-unique command sets for optical drives, tape drives,
scanners, and so on. To make the bus wider, a system designer uses a second 68-pin connector in addition to
the standard 50-pin connector.
Magnetic Storage Densities and Latencies
Latency is divided into two categories: seek latency and rotational latency. Data management provides
a command queuing mechanism to minimize latencies, and also sets up the scatter-gather process to gather
scattered data in CPU main memory.
Seek Latencies: There are three kinds of seek latency: overlapped seek latency, mid-transfer seek latency,
and elevator seek latency.
Rotational Latencies: To reduce rotational latency, two methods are used:
(i) Zero latency read/write: Zero latency reads allow transferring data immediately after the head settles;
the drive does not wait for the disk to rotate to the start of the sector.
(ii) Interleaving factor: It determines the organization of sectors so that the drive keeps up with the data
stream without skipping sectors.
Transfer Rate and I/O per Second: The I/O transfer rate varies from 1.2 Mbytes/sec to 40 Mbytes/sec.
Transfer rate is defined as the rate at which data is transferred from the drive buffer to the host adapter
memory.
Data Management: It includes command queuing and scatter-gather. Command queuing allows
execution of multiple sequential commands with minimal system CPU intervention. Scatter is a process of
setting up the data for best fit in the available blocks of memory or disk. Gather is a process which
reassembles data into contiguous blocks on memory or disk.
The figure below shows the relationship between seek latency, rotational latency, and data transfer.
Disk Spanning
It is a method of attaching multiple drives to a single host adapter. The data is written to the first drive first;
after it is filled, the controller allows the data to be written to the second drive, and so on.
    Mean Time Between Failures (MTBF) = MTBF of a single drive / Total number of drives
1. RAID Level 0 Disk Striping
RAID Level 0 provides performance improvement, achieved by overlapping disk reads and writes.
Overlapping here means that while segment 1 is being written to drive 1, the write of segment 2 can be
initiated for drive 2.
The actual performance achieved depends on the design of the controller and how it manages disk reads and
writes.
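The segment-by-segment overlap described above rests on distributing consecutive segments across drives round-robin, which can be sketched as follows; the drive count and segment size are illustrative parameters.

```python
def stripe_write(data, n_drives, segment_size):
    """Distribute consecutive segments across drives round-robin,
    so writes of adjacent segments can overlap on different drives."""
    drives = [[] for _ in range(n_drives)]
    segments = [data[i:i + segment_size]
                for i in range(0, len(data), segment_size)]
    for i, seg in enumerate(segments):
        drives[i % n_drives].append(seg)   # segment i -> drive (i mod n)
    return drives

print(stripe_write("ABCDEFGH", n_drives=2, segment_size=2))
# -> [['AB', 'EF'], ['CD', 'GH']]: odd segments on drive 1, even on drive 2
```

Because adjacent segments land on different drives, the controller can issue the write of segment 2 while segment 1 is still in flight, which is where the performance gain comes from.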
2. RAID Level 1 Disk Mirroring
Disk mirroring causes two copies of every file to be written on two separate drives (data redundancy is
achieved).
These drives are connected to a single disk controller. It is useful in mainframe and networking systems.
Moreover, if one drive fails, the other drive, which holds a copy of the data, can be used.
Performance: Writing is slow. Reading can be speeded up by overlapping seeks. The read transfer rate and
the number of I/Os per second are better than for a single drive.
    I/O transfer rate (bandwidth) = No. of drives x drive I/O transfer rate
    No. of I/Os per second = I/O transfer rate / Average size of transfer
Disk controller arrangement for RAID Level 1
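The two performance formulas above can be applied in a short sketch; the 5 Mbytes/sec drive rate and the 64-Kbyte average transfer size are assumed example values, not figures from the text.

```python
def mirrored_read_rate(n_drives, drive_rate_mb_s):
    """Aggregate read bandwidth: reads can be overlapped across the
    mirrored drives, so rate = no. of drives x drive transfer rate."""
    return n_drives * drive_rate_mb_s

def ios_per_second(transfer_rate_mb_s, avg_transfer_mb):
    """No. of I/Os per second = I/O transfer rate / average transfer size."""
    return transfer_rate_mb_s / avg_transfer_mb

rate = mirrored_read_rate(2, 5.0)   # two mirrored drives at 5 Mbytes/sec
print(rate)                         # -> 10.0 Mbytes/sec aggregate reads
print(ios_per_second(rate, 0.064))  # with 64-Kbyte average transfers
```

Note the formulas apply to reads only; writes must go to both drives, which is why writing stays slow.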
Uses:
Provides backup in the event of disk failures in file servers.
Another form of disk mirroring, called duplexing, uses two separate controllers; this second controller
enhances both fault tolerance and performance.
Organization of bit interleaving for RAID Level 2
It provides the ability to handle very large files, and a high level of integrity and reliability. It is good for
multimedia systems. RAID Level 2 utilizes a Hamming error-correcting code to correct single-bit and
double-bit errors.
Drawbacks:
(i) It requires multiple drives for error correction.
(ii) It is an expensive approach to data redundancy.
(iii) It is slow.
Uses: It is used in multimedia systems, because bulk video and audio data can be stored.
5. RAID Level-4 Sector Interleaving: Sector interleaving means writing successive sectors of data on
different drives.
As in RAID 3, RAID 4 employs multiple data drives and typically a single dedicated parity drive. Unlike
RAID 3, where bits of data are written to successive disk drives, in RAID 4, the first sector of a block of
data is written to the first drive, the second sector of data is written to the second drive, and so on. The data
is interleaved at the sector level.
RAID Level-4 offers a cost-effective improvement in performance with data redundancy.
6. RAID Level-5 Block Interleaving: In RAID Level-5, as in all the other RAID systems, multiple drives are
connected to a disk array controller.
The disk array controller contains multiple SCSI channels.
A RAID 5 system can be designed with a single SCSI host adapter with multiple drives connected to the
single SCSI channel.
Unlike RAID Level-4, where the data is sector-interleaved, in RAID Level-5 the data is block-interleaved.
1. CD-ROM
Physical Construction of CD ROMs:
It consists of a polycarbonate disk with a 15 mm spindle hole in the center. The polycarbonate substrate
contains lands and pits.
The space between two adjacent pits is called a land. Pits represent binary zero, and the transition from land
to pit and from pit to land is represented by binary one.
The polycarbonate substrate is covered by reflective aluminium, aluminium alloy, or gold to increase the
reflectivity of the recorded surface. The reflective surface is protected by a coat of lacquer to prevent
oxidation. A CD-ROM consists of a single track which starts at the center and spirals outwards. The data is
encoded on this track in the form of lands and pits. The single track is divided into equal-length sectors and
blocks. Each sector or block consists of 2352 bytes, also called a frame. For audio CDs, the data is indexed,
or addressed, by hours, minutes, seconds, and frames. There are 75 frames in a second.
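The addressing scheme above (75 frames per second, 2352 bytes per frame) translates into a simple address calculation; the function name below is illustrative.

```python
FRAMES_PER_SECOND = 75   # 75 frames (sectors) per second on a CD
BYTES_PER_FRAME = 2352   # raw sector/frame size in bytes

def msf_to_frame(minutes, seconds, frames):
    """Convert a minutes:seconds:frames address to an absolute
    frame (sector) number along the spiral track."""
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames

# One second of audio occupies 75 frames:
print(msf_to_frame(0, 1, 0))                     # -> 75
# A 74-minute disc holds 74*60*75 frames of 2352 bytes each:
print(msf_to_frame(74, 0, 0) * BYTES_PER_FRAME)  # ~783 million bytes raw
```

Because the track is a single spiral, an absolute frame number is all the drive needs to locate any sector.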
Magnetic Disk Organization: Magnetic disks are organized by cylinder, track, and sector. Magnetic hard
disks contain concentric circular tracks, which are divided into sectors.
Components of a rewritable phase-change CD-ROM
2. Mini-Disk
Mini-Disk for Data is known as MD-Data. It was developed by Sony Corporation as the data version of
the new rewritable storage format. It can be used in three formats to support all users:
A premastered optical disk.
A recordable magneto-optical disk.
A hybrid that is partly mastered and partly recordable.
Its size is 2.5 inches. It provides large capacity at low cost and is used in multimedia applications. The MD
is intended as a replacement for the audio cassette. A 2.5-inch MD-Data disk stores 140 Mbytes of data and
transfers data at 150 Kbytes/sec. The figure shows the format of the MD-Data standard.
The optical disk of a WORM drive consists of six layers. The first layer is a polycarbonate substrate.
The next three layers are multiple recording layers made from antimony selenide (Sb2Se3) and
bismuth telluride (Bi2Te3).
The bismuth telluride is sandwiched between antimony selenide layers, as shown in the figure.
The recording layers are covered by aluminium alloy or gold to increase the reflectivity of the recorded
surface.
The reflective surface is protected by a coat of lacquer to prevent oxidation.
Recording (writing) of information: During recording, the input signal is fed to a laser diode. The laser
beam from the laser diode is modulated by the input signal, which switches the laser beam on and off. When
the beam is on, it strikes the three recording layers.
The beam is absorbed by the bismuth telluride layer, generating heat within the layer. This heat diffuses
the atoms in the three recording layers, forming a four-element alloy layer. This alloy layer now becomes
the recorded layer.
The laser beam is reflected off the surface of the disk.
The weak magnetic field polarizes the laser beam, and the plane of the beam is rotated clockwise or
counterclockwise; this phenomenon is called the Kerr effect.
The direction of rotation of the beam depends on the polarity of the magnetic field.
A juke box may contain one or more drives, and the drives may be of different types, including WORM,
rewritable, or multifunction. A juke box is used for storing large volumes of multimedia information in
one cost-effective store.
Juke box-based optical disk libraries can be networked so that multiple users can access the information.
Optical disk libraries serve as near-line storage for infrequently used data.
Hierarchical Storage Applications: Banks, insurance companies, hospitals, state and federal governments,
manufacturing companies, and a variety of other business and service organizations need to permanently
store large volumes of their records, from simple documents to video information, for audit trail use.
Many cache designs use a high-water mark and a low-water mark to trigger cache management operations.
When the cache storage fills up to the high-water mark, the cache manager starts creating more space in
cache storage. Space is created by discarding objects.
The cache manager maintains a database of the objects in the cache. Cache areas containing updated
objects are frequently called dirty cache.
Objects in dirty cache are written back at predetermined time intervals, or before an object is
discarded.
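The high-water/low-water mechanism can be sketched as follows; the discard-oldest policy and the byte sizes are illustrative assumptions, and a real cache manager would also write back dirty objects before discarding them.

```python
class CacheManager:
    """Sketch of high/low-water-mark cache management: when stored
    bytes reach the high-water mark, discard the oldest objects until
    usage falls to the low-water mark."""

    def __init__(self, high_water, low_water):
        self.high, self.low = high_water, low_water
        self.objects = {}   # name -> size; dict preserves insertion order

    def add(self, name, size):
        self.objects[name] = size
        if sum(self.objects.values()) >= self.high:
            self._make_space()

    def _make_space(self):
        # Discard oldest objects until usage drops to the low-water mark.
        while sum(self.objects.values()) > self.low and self.objects:
            oldest = next(iter(self.objects))
            del self.objects[oldest]

cache = CacheManager(high_water=100, low_water=50)
for i in range(12):
    cache.add(f"obj{i}", 10)
print(sum(cache.objects.values()))   # -> 70: trimmed at the high mark, then refilled
```

Trimming down to the low-water mark, rather than just below the high one, avoids triggering a discard on every subsequent insertion.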
Possible Questions
2 - MARKS
1. What do you mean by compression and decompression?
2. State the types of compression.
3. What is Huffman Encoding?
4. Write short notes on Packbits encoding.
5. Write about CCITT Group 3 1D compression.
6. What is the necessity of the k factor in CCITT Group-3 2D compression?
7. What are the steps involved in the pseudocode to code a line in data formatting for CCITT
Group-3 2D?
8. State the advantages of CCITT Group-3 2D.
9. Write short notes on YUV representation.
10. What is Discrete Cosine Transform?
11. Write short notes on Quantization.
12. What are the color characteristics?
13. What is predictive lossless encoding used for?
14. What is the role of entropy encoding in data compression?
15. What is Macroblock in MPEG?
16. Define the term "Motion Compensation".
17. Write short notes on MPEG encoder.
18. What do you mean by vector quantization in the MPEG standard?
19. What do you mean by audio compression?
20. Write short notes on fractal compression.
21. List the key formats available in Rich Text Format.
22. What is TIFF File Format?
23. What is TIFF tag ID?
24. What is full motion video?
25. What are the advantages of a dye sublimation printer?
26. List the features of scanner.
27. What does ADC stand for?
28. Write short notes on histogram sliding. What is dithering?
29. What do you mean by a RIFF chunk?
30. Define a LIST chunk.
31. Describe the TWAIN architecture.
32. How does a scanner work?
33. Give some of the visual display technology standards
34. State the four important factors that govern the ADC process.
35. State the three types of Voice Recognition system.
36. Write short notes on MIDI.
37. What is digital camera? State its advantages.
38. What do you mean by spatial filter processing?
39. Write short notes on disk spanning.
40. What is RAID?
41. State the uses of magnetic storage in multimedia.
42. Give brief notes on CD-ROM.
43. What are the three formats of minidisk?
44. What is a juke box? Give another name used for a juke box.
45. What are the four types of storage in cache organization for hierarchical storage systems?
16 Marks
1. Explain briefly about binary image compression schemes. (16)
2. (a) Explain the characteristics of color in detail. (10)
(b) State the relationship between frequency and wavelength in measuring radiant energy. (6)
3. Write about JPEG in detail. (16)
4. (a) Explain DCT in detail. (12)
(b) Write short notes on the zig-zag sequence. (4)
5. State the requirements for full motion video compression in detail. (16)
6. What is MPEG? Discuss it in detail. (16)
7. Explain all data and file format standards in detail. (16)
8. (a) Give a detailed description of voice recognition.
(b) What does DIB stand for? (4)
9. Explain the different types of messages that are used with the MIDI communication protocol. (16)
10. Explain the TWAIN architecture with a neat diagram. (16)
11. Explain some of the video and image display system techniques in detail. (16)
12. Give short notes on the following standards: (i) MDA (3), (ii) CGA (3), (iii) MGA (3), (iv) VGA (4),
(v) XGA (3).
14. (a) Explain in detail about voice recognition systems. (12)
(b) Write down the formula for voice recognition accuracy and substitution error.
16. (a) Describe hierarchical storage management in detail.
(b) What is migration? Write short notes about it.
17. (i) What is an Optical Disk Library? Explain it.
(ii) Discuss Cache Management for storage systems.