Data Compression and Huffman Algorithm

Data compression algorithms aim to reduce file sizes by eliminating redundant data. The Huffman algorithm assigns variable-length codes to characters based on their frequency, allowing more common characters to be encoded with fewer bits. Run-length encoding replaces repeated characters with a code indicating the character and number of repeats. Lossy techniques like JPEG discard insignificant data to achieve higher compression ratios, while lossless methods like LZW and run-length encoding allow exact reconstruction. Data compression is important for reducing storage and transmission requirements.

DATA COMPRESSION AND HUFFMAN ALGORITHM

Technical Seminar Paper

Presented by Vineet Agarwala (IT200118155)

Under the guidance of Anisur Rahman

NATIONAL INSTITUTE OF SCIENCE & TECHNOLOGY

DATA COMPRESSION
Virtually all forms of data (text, numerical, image, video) contain redundant elements.
Data can be compressed by eliminating the redundant elements. A code is substituted for each eliminated element, where the code is shorter than the element it replaces.
When compressed data is retrieved from storage or received over a communications link, it is expanded back to its original form, based on the code.
Compression is used:
  to save storage space
  to reduce communications transmission requirements
Data compression is the art or science of compactly representing information. In the digital realm, this means using fewer bits to represent information.
Data = information + redundancy; compression targets the redundancy.

REDUNDANCY
Most types of computer files are fairly redundant: they have the same information listed over and over again. File-compression programs simply get rid of the redundancy.

Example: "Ask not what your country can do for you -- ask what you can do for your country."
Ignoring the difference between capital and lower-case letters, roughly half of the phrase is redundant. Nine words (ask, not, what, your, country, can, do, for, you) give us almost everything we need for the entire quote.
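The claim about the quote can be checked with a few lines of Python. This is only an illustrative sketch: the function name `word_redundancy` is made up for this example, and splitting on letters is an assumed way of defining a "word".

```python
import re

QUOTE = ("Ask not what your country can do for you -- "
         "ask what you can do for your country.")

def word_redundancy(text):
    """Return (total word count, distinct words in order of first use)."""
    # Lower-case the text so that "Ask" and "ask" count as the same word.
    words = re.findall(r"[a-z]+", text.lower())
    # dict.fromkeys keeps the first occurrence of each word, in order.
    distinct = list(dict.fromkeys(words))
    return len(words), distinct

total, distinct = word_redundancy(QUOTE)
print(total, len(distinct), distinct)
```

The quote has 17 words but only 9 distinct ones, so 8 of the 17 words are repeats, which is the "roughly half" stated above.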

Compression Techniques
Lossless
  Data can be completely recovered after decompression
  Recovered data is identical to the original
  Exploits redundancy in data

Lossy
  Data cannot be completely recovered after decompression
  Some information is lost forever
  Gives more compression than lossless
  Discards insignificant data components

IMAGE COMPRESSION
Image compression can be lossy or lossless.

Methods for lossless image compression are:
  Run-length encoding
  Entropy coding
  Adaptive dictionary algorithms such as LZW

Methods for lossy compression are:
  Reducing the color space to the most common colors in the image. The selected colors are specified in the color palette in the header of the compressed image. Each pixel just references the index of a color in the color palette. This method can be combined with dithering to blur the color borders.
  Transform coding. This is the most commonly used method. A Fourier-related transform such as the DCT or the wavelet transform is applied, followed by quantization and entropy coding.
  Fractal compression.
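The palette method in the list above can be sketched as follows. This is a minimal illustration, not a production quantizer: the function names are hypothetical, pixels are assumed to be (R, G, B) tuples, and "nearest color" is assumed to mean smallest squared distance in RGB space. Dithering is not shown.

```python
from collections import Counter

def build_palette(pixels, size):
    """Pick the `size` most common colors in the image as the palette."""
    return [color for color, _ in Counter(pixels).most_common(size)]

def nearest_index(color, palette):
    """Index of the palette entry closest to `color` (squared RGB distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(color, palette[i])),
    )

def to_indexed(pixels, size):
    """Return (palette, indices): each pixel becomes an index into the palette."""
    palette = build_palette(pixels, size)
    return palette, [nearest_index(p, palette) for p in pixels]

# Five red pixels, three blue pixels and one near-red pixel, two-color palette.
pixels = [(255, 0, 0)] * 5 + [(0, 0, 255)] * 3 + [(250, 5, 5)]
palette, indices = to_indexed(pixels, 2)
print(palette, indices)
```

The near-red pixel is mapped to the red palette entry, so its exact color is lost. That loss is what makes the method lossy.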

JPEG (TRANSFORM COMPRESSION)

JPEG is named after its origin, the Joint Photographic Experts Group.
This involves reducing the number of bits per sample or discarding some of the samples entirely.
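The idea of reducing the number of bits per sample can be illustrated on its own, outside JPEG. The sketch below is not the JPEG algorithm: it assumes 8-bit samples and simply keeps the top bits of each one; the function names are hypothetical.

```python
def quantize(samples, bits):
    """Keep only the top `bits` bits of each 8-bit sample."""
    shift = 8 - bits
    return [s >> shift for s in samples]

def dequantize(codes, bits):
    """Map each code back to the middle of the range of values it covers."""
    shift = 8 - bits
    half = (1 << shift) >> 1
    return [(c << shift) + half for c in codes]

samples = [0, 37, 128, 255]
codes = quantize(samples, 4)          # 4 bits per sample instead of 8
restored = dequantize(codes, 4)       # close to, but not equal to, the input
print(codes, restored)
```

The restored values differ from the originals by at most 8, which shows why this kind of reduction is lossy: the discarded low-order bits cannot be recovered.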

MULTIMEDIA COMPRESSION
Multimedia compression is a general term referring to the compression of any type of multimedia, most notably graphics, audio, and video.
MPEG stands for Moving Picture Experts Group.
The future of this technology is to encode the compression and decompression algorithms directly into integrated circuits.
The approach used by MPEG can be divided into two types of compression: within-the-frame and between-frame.

DATA COMPRESSION ALGORITHMS

LOSSLESS COMPRESSION
  Run-Length Encoding
  Huffman Coding
  Delta
  LZW

LOSSY COMPRESSION
  CS&Q
  JPEG
  MPEG

RUN-LENGTH ENCODING
Data files frequently contain the same character repeated many times in a row.

Example of run-length encoding. Each run of zeros is replaced by two characters in the compressed file: a zero to indicate that compression is occurring, followed by the number of zeros in the run.
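The zero-run scheme described above can be sketched in Python as follows. The function names are hypothetical, and one detail is an assumption not stated in the slides: the run count is stored in a single byte, so runs longer than 255 zeros are split into several pairs.

```python
def rle_zeros_encode(data: bytes) -> bytes:
    """Replace each run of zero bytes with the pair (0, run length)."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            # Assumption: the count must fit in one byte, so cap runs at 255.
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes([0, run])
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def rle_zeros_decode(packed: bytes) -> bytes:
    """Expand each (0, run length) pair back into a run of zero bytes."""
    out = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == 0:
            out += bytes(packed[i + 1])  # bytes(n) yields n zero bytes
            i += 2
        else:
            out.append(packed[i])
            i += 1
    return bytes(out)

original = bytes([17, 8, 54, 0, 0, 0, 97, 5, 16, 0, 45, 23, 0, 0, 0, 0, 0, 3, 67, 0, 0, 8])
packed = rle_zeros_encode(original)
print(list(packed))
```

Note that a single isolated zero grows from one byte to two under this scheme, so the method only pays off when zeros tend to come in runs.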

HUFFMAN ENCODING
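This method is named after D.A. Huffman, who developed the procedure in the 1950s.
Character usage in a file is typically very uneven: in the example file, more than 96% of the content consists of only 31 characters out of 127. Huffman encoding exploits this by giving the more common characters shorter codes.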

HUFFMAN ENCODING EXAMPLE

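Character frequencies
A: 20% (.20)  B: 9% (.09)  C: 15% (.15)  D: 11% (.11)  E: 40% (.40)  F: 5% (.05)

First step: the two least frequent characters, B (.09) and F (.05), are merged into a node BF (.14), with B on branch 0 and F on branch 1.
Nodes after the first merge: E .40, A .20, C .15, BF .14, D .11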

HUFFMAN ENCODING EXAMPLE (CONTD.)

Codes
A: 010  B: 0000  C: 011  D: 001  E: 1  F: 0001

Code tree (branch label, node, weight)
ABCDEF 1.0
  0  ABCDF .60
    0  BFD .25
      0  BF .14
        0  B .09
        1  F .05
      1  D .11
    1  AC .35
      0  A .20
      1  C .15
  1  E .40
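The tree construction in this example can be reproduced with a short Python sketch built on a priority queue. It is one possible implementation, not necessarily the one behind the slides: the function names are hypothetical, the frequencies are given as integer percentages to avoid floating-point rounding, and the choice of which branch gets 0 or 1 is arbitrary, so the bit patterns may differ from the table above while the code lengths stay the same.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a {symbol: bit string} table from a {symbol: weight} mapping."""
    tiebreak = count()  # unique counter so equal weights never compare nodes
    heap = [(w, next(tiebreak), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Repeatedly merge the two lightest nodes into one internal node.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a symbol
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

def encode(text, codes):
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    """Decode by matching prefixes; works because no code prefixes another."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

freqs = {"A": 20, "B": 9, "C": 15, "D": 11, "E": 40, "F": 5}
codes = huffman_codes(freqs)
average = sum(freqs[s] * len(codes[s]) for s in freqs) / 100
print(codes, average)
```

With these frequencies the average code length is 2.34 bits per character, compared with 3 bits for a fixed-length code over six characters.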

Run Length Encoding

Original: CTAAAAAGGGTCGTTTTTTGCCCGGGGGCCTCCCCCCC (38 symbols)

Run length encoded: CT5A3GTCG6TG3C5GCCT7C (21 symbols)

Runs of three or more identical symbols are replaced by the run length followed by the symbol; shorter runs are left unchanged.
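The encoding used in this example can be sketched in Python as shown below. The function names and the `min_run` parameter are hypothetical; the threshold of three is inferred from the example, where "GGG" becomes "3G" but "CC" is left alone. The decoder assumes the original text contains no digits.

```python
import re
from itertools import groupby

def rle_encode(text: str, min_run: int = 3) -> str:
    """Write runs of at least `min_run` symbols as <count><symbol>."""
    parts = []
    for symbol, group in groupby(text):
        n = len(list(group))
        parts.append(f"{n}{symbol}" if n >= min_run else symbol * n)
    return "".join(parts)

def rle_decode(encoded: str) -> str:
    """Expand every <count><symbol> pair; other symbols pass through."""
    return re.sub(r"(\d+)(\D)", lambda m: m.group(2) * int(m.group(1)), encoded)

dna = "CTAAAAAGGGTCGTTTTTTGCCCGGGGGCCTCCCCCCC"
packed = rle_encode(dna)
print(packed, len(dna), len(packed))
```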

Run Length Encoding (Contd.)

WWWBWWWWWBWWWBWWWWBWWWWWBWWWBWW WWWBWWBWWWWWWBBBWWWWWWWBWBWWWWW WWBWWBBWWWWWBWWWWBWWWWBWWWWB

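Start of the input: WWWBWWWWWBWWWBWWWWB...

Run length encoded: 3WB5WB3WB4WB...

Possible optimization: with only two symbols, store just the alternating run lengths: 3151314...
But the optimization requires an escape character, followed by the first symbol, to start the sequence: #W3151314...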
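The run-length-only optimization for two-symbol data can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the first symbol is returned separately rather than written after an escape character, and run lengths are kept as a list of integers so that runs longer than nine symbols stay unambiguous.

```python
from itertools import groupby

def run_lengths(text):
    """Return (first symbol, list of alternating run lengths)."""
    runs = [(symbol, len(list(group))) for symbol, group in groupby(text)]
    first = runs[0][0] if runs else ""
    return first, [n for _, n in runs]

def expand(first, lengths, symbols="WB"):
    """Rebuild the text by alternating between the two symbols."""
    a, b = symbols
    current = first
    out = []
    for n in lengths:
        out.append(current * n)
        current = b if current == a else a
    return "".join(out)

row = "WWWBWWWWWBWWWBWWWWB"
first, lengths = run_lengths(row)
print(first, lengths)
```

Because the symbols strictly alternate, only the first symbol has to be stored; every later symbol follows from its position in the list.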

Run Length Encoding (Contd.)

Is run length encoding practical for images?

No: The chances of three or more identical consecutive pixels are low for most real images, especially images with large color depth.

Yes: Some images do have long runs of identical consecutive pixels, especially images with low color depth. RLE is used for fax machines, and by BMP, TIFF and PCX files.

LZW Compression
LZW compression is named after its developers, A. Lempel and J. Ziv, with later modifications by Terry A. Welch. It is the foremost technique for general-purpose data compression due to its simplicity and versatility.

LZW Compression (contd.)

LZW compression flowchart. The variable, CHAR, is a single byte. The variable, STRING, is a variable length sequence of bytes. Data are read from the input file (box 1 & 2) as single bytes, and written to the compressed file (box 4) as 12 bit codes.
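Since the flowchart itself is not reproduced here, the following Python sketch shows one common way to implement the loop it describes. It is an approximation, not a transcription of the flowchart: the function names are hypothetical, codes 0 to 255 are assumed to stand for single bytes, the table is assumed to stop growing at 4096 entries (the number of distinct 12 bit codes), and the codes are returned as a list of integers rather than packed into 12 bit fields.

```python
def lzw_compress(data: bytes, max_codes: int = 4096) -> list[int]:
    """Compress bytes into a list of LZW codes (each below max_codes)."""
    table = {bytes([i]): i for i in range(256)}  # single bytes: codes 0-255
    string = b""                                 # STRING: current match
    out = []
    for value in data:
        char = bytes([value])                    # CHAR: next input byte
        candidate = string + char
        if candidate in table:
            string = candidate                   # keep extending the match
        else:
            out.append(table[string])            # emit code for longest match
            if len(table) < max_codes:
                table[candidate] = len(table)    # learn the new string
            string = char
    if string:
        out.append(table[string])
    return out

def lzw_decompress(codes: list[int], max_codes: int = 4096) -> bytes:
    """Rebuild the original bytes, re-creating the table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the string the compressor had just added.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        if len(table) < max_codes:
            table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

text = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(text)
print(len(text), len(codes), lzw_decompress(codes) == text)
```

The decompressor never receives the table; it rebuilds the same entries one step behind the compressor, which is why only the codes need to be stored.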

CONCLUSION
Is it possible to create a data compression algorithm that will always compress data?
Is there an optimal data compression algorithm?
  Lossless: No, compression rates depend on the data.
  Lossy: No, the quality of compression is subjective.

Is data compression really that important?
