11/04/16
DATA
WAREHOUSING
Basics
Concepts
By- Abhijeet Sakhare
Agenda
Evolution of DWH
Why should we consider Data Warehousing solutions?
Definition of Data Warehouse
Characteristics of DWH
Difference between DWs and OLTP
DWH Life Cycle
DWH Architecture
ODS Vs. Warehouse
Sources of Data Warehouse Data
Appropriate uses of Data Warehouse Data
Inappropriate uses of Data Warehouse Data
Levels of Granularity of Data Warehouse Data
Options for viewing Data
Next step in Data Warehouse Evolution
Evolution of DWH
Traditional approaches to computer system design
during the 1980s:
Not optimized for analysis and reporting
Company-wide reporting could not be supported from a
single system
Developing reports often required writing specific
computer programs, which was slow and expensive
Why should we consider Data Warehousing solutions?
When users are requesting access to a large amount of
historical information for reporting purposes, you should
strongly consider a warehouse or mart. The user will benefit
when the information is organized in an efficient manner for
this type of access.
Definition of Data Warehouse
A DWH is a type of relational database system specially
designed for query and analysis processing rather than
transaction processing.
DWH systems are also called historical databases, read-only
databases, integrated databases, decision support systems,
executive information systems, or business information systems.
Characteristics of DWH
Subject Oriented
Non Volatile
Integrated
Time Variant
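Two of these characteristics, non-volatile and time variant, can be illustrated with a small sketch. It assumes a hypothetical customer snapshot history in which each load appends dated rows and never updates earlier ones; the names `history`, `load`, and `city_as_of` are illustrative and do not come from any particular product.

```python
# Hypothetical snapshot history: rows are appended on every load and never
# updated in place (non-volatile); each row carries its load date (time variant).
history = []

def load(snapshot_date, customers):
    """Append one dated row per customer; earlier rows stay untouched."""
    for cust_id, city in customers.items():
        history.append({"as_of": snapshot_date, "cust_id": cust_id, "city": city})

def city_as_of(cust_id, date):
    """Return the customer's city as known on the given ISO date, or None."""
    rows = [r for r in history if r["cust_id"] == cust_id and r["as_of"] <= date]
    if not rows:
        return None
    # ISO dates sort correctly as strings, so max() picks the latest snapshot.
    return max(rows, key=lambda r: r["as_of"])["city"]

load("2016-01-31", {1: "Pune"})
load("2016-02-29", {1: "Mumbai"})

print(city_as_of(1, "2016-02-15"))
print(city_as_of(1, "2016-03-01"))
```

Because old rows are kept, the same question can be answered for any past date, which an operational system that overwrites the current value could not do.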
Differences between DWH database (OLAP) and OLTP database

DWH database (OLAP): Designed for analysis of business measures by category and attributes.
OLTP database: Designed for real time business operations.

DWH database (OLAP): Optimized for bulk loads and large, complex, unpredictable queries that access many rows per table.
OLTP database: Optimized for a common set of transactions, usually adding or retrieving a single row at a time per table.

DWH database (OLAP): Loaded with consistent, valid data; requires no real time validation.
OLTP database: Optimized for validation of incoming data during transactions; uses validation data tables.

DWH database (OLAP): Supports few concurrent users relative to OLTP.
OLTP database: Supports thousands of concurrent users.
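The two access patterns contrasted above can be sketched with Python's built-in `sqlite3` module. The `sales` table, its columns, and its values are assumptions made only for illustration; a real warehouse would use a dedicated engine and far more data.

```python
import sqlite3

# Hypothetical in-memory "sales" table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, branch TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "Pune", "Loan", 100.0),
        (2, "Pune", "Card", 50.0),
        (3, "Mumbai", "Loan", 200.0),
        (4, "Mumbai", "Card", 25.0),
    ],
)

# OLTP-style access: retrieve a single row by its key.
oltp_row = conn.execute(
    "SELECT order_id, branch, product, amount FROM sales WHERE order_id = ?",
    (3,),
).fetchone()

# OLAP-style access: read many rows and aggregate a measure by a category.
olap_totals = conn.execute(
    "SELECT branch, SUM(amount) FROM sales GROUP BY branch ORDER BY branch"
).fetchall()

print(oltp_row)
print(olap_totals)
```

The first query touches one row through the primary key, while the second has to read every row in the table, which is the kind of workload a warehouse is tuned for.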
DWH Database (OLAP)                     OLTP Database
Multidimensional database structures    Normalized data structures
Indexes - Many                          Indexes - Few
Joins - Few                             Joins - Many
Aggregated data - More                  Aggregated data - Less
No. of users - Few                      No. of users - More
Periodic update of data                 Data modification - More
Huge volumes of data                    Small volumes of data
DWH Life Cycle
Business Analyst
Data Modeler
ETL Developer
Report Developer
Testing
DWH Architecture
Three common architectures are:
DWH Architecture (Basic)
DWH Architecture (With a staging area)
DWH Architecture (With a staging area and data marts)
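As a rough illustration of the third architecture, the sketch below moves records from sources into a staging area, cleans them into a warehouse, and derives a data mart from it. The source records, field names, and the functions `extract`, `transform`, and `build_mart` are all hypothetical; the cleaning rules are assumptions, not a prescribed method.

```python
# Hypothetical records from two operational source systems (illustrative only).
source_a = [{"cust": " alice ", "amount": "100", "dept": "sales"}]
source_b = [
    {"cust": "BOB", "amount": "250", "dept": "finance"},
    {"cust": "carol", "amount": "bad", "dept": "sales"},
]

def extract(*sources):
    """Copy raw records from every source into one staging list, unchanged."""
    return [dict(rec) for src in sources for rec in src]

def transform(staging):
    """Clean staged records; drop rows whose amount is not a number."""
    clean = []
    for rec in staging:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # assumed rule: reject rows that fail validation
        clean.append(
            {"cust": rec["cust"].strip().title(), "amount": amount, "dept": rec["dept"]}
        )
    return clean

def build_mart(warehouse, dept):
    """A data mart here is simply the warehouse rows for one department."""
    return [rec for rec in warehouse if rec["dept"] == dept]

staging = extract(source_a, source_b)      # staging area: raw copy of sources
warehouse = transform(staging)             # warehouse: cleaned, integrated data
sales_mart = build_mart(warehouse, "sales")  # data mart: one subject subset

print(len(staging), len(warehouse), sales_mart)
```

In the basic architecture the staging step is absent and users query the warehouse directly; the staging area exists so that cleaning happens before data reaches the warehouse.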
DWH Architecture (Basic)
DWH Architecture (with a staging area)
DWH Architecture (with a staging area and data marts)
ODS Vs. Data Warehouse
Operational Data Store
Characteristics: Data Focused Integration From Transaction Processing Focused Systems
Age Of The Data: Current, Near Term (Today, Last Weeks)
Primary Use: Day-To-Day Decisions, Tactical Reporting, Current Operational Results
Frequency Of Load: Twice Daily, Daily, Weekly

Data Warehouse
Characteristics: Subject Oriented, Integrated, Non-Volatile, Time Variant
Age Of The Data: Historic (Last Month, Quarterly, Five Years)
Primary Use: Long-Term Decisions, Strategic Reporting, Trend Detection
Frequency Of Load: Weekly, Monthly, Quarterly
Sources of Data Warehouse Data
Archives (Historic Data)
Current Systems of Record (Recent History)
Operational Transactions (Future Data Source)
These sources feed the Enterprise Data Warehouse.
Appropriate Uses of Data Warehouse Data
Produce Reports For Long Term Trend Analysis
Produce Reports Aggregating Enterprise Data
Produce Reports of Multiple Dimensions
(Earned revenue by month by product by branch)
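A report over multiple dimensions, such as earned revenue by month by product by branch, amounts to grouping a measure by a chosen set of dimensions. The sketch below assumes a hypothetical list of fact rows and a helper named `revenue_by`; both are illustrative only.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, product, branch, earned_revenue).
facts = [
    ("2016-01", "Loan", "Pune", 100.0),
    ("2016-01", "Loan", "Pune", 50.0),
    ("2016-01", "Card", "Mumbai", 30.0),
    ("2016-02", "Loan", "Pune", 70.0),
]

def revenue_by(rows, *dims):
    """Total earned revenue grouped by the named dimensions."""
    index = {"month": 0, "product": 1, "branch": 2}
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[index[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)

# Earned revenue by month by product by branch.
report = revenue_by(facts, "month", "product", "branch")
# The same facts rolled up to a single dimension.
by_month = revenue_by(facts, "month")

for key, total in sorted(report.items()):
    print(key, total)
```

Dropping or adding a dimension name changes the shape of the report without touching the underlying facts.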
Inappropriate Uses of Data Warehouse Data
Replace Operational Systems
Replace Operational Systems Reports
Analyze Current Operational Results
Levels of Granularity of Data Warehouse Data
Atomic (Transaction)
Lightly Summarized
Highly Summarized
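The three levels can be sketched by summarizing the same rows at coarser and coarser keys. The transaction rows and the choice of keys (daily per branch for "lightly", monthly overall for "highly") are assumptions for illustration; where the boundary sits depends on the warehouse design.

```python
from collections import defaultdict

# Atomic level: one hypothetical row per transaction (date, branch, amount).
atomic = [
    ("2016-04-01", "Pune", 10.0),
    ("2016-04-01", "Pune", 15.0),
    ("2016-04-02", "Pune", 5.0),
    ("2016-04-02", "Mumbai", 20.0),
]

def summarize(rows, key_fn):
    """Sum the amount column, grouped by whatever key_fn returns per row."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_fn(row)] += row[2]
    return dict(totals)

# Lightly summarized (assumed): daily totals per branch.
light = summarize(atomic, lambda r: (r[0], r[1]))
# Highly summarized (assumed): monthly totals across all branches.
high = summarize(atomic, lambda r: r[0][:7])

print(len(atomic), len(light), len(high))
```

Each level holds fewer rows than the one before it, trading detail for faster, simpler queries.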
Options for Viewing Data
Text
Chart (values on a 0 to 90 scale by quarter: 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr)
Next Steps In Data Warehouse Evolution
Use It - Analyze Data Warehouse Data
Determine Additional Data Requirements
Define Sources For Additional Data
Add New Data (Subject Areas) to Data Warehouse
Thank You !!!