Module 5
INTRODUCTION TO
MANAGEMENT
INFORMATION SYSTEMS
Module 5 – Foundations of Business Intelligence:
Databases and Information
Management
Module Objectives
File Organization Terms and Concepts
The Database Approach to Data Management
• Relational Databases
• Non-Relational Databases
Using Databases to Improve Business Performance and Decision
Making
Managing Data Resources
File Organization Terms and
Concepts
Database: Group of related files
File: Group of records of same type
Record: Group of related fields
Field: Group of characters as word(s) or number(s)
Entity: Person, place, thing on which we store information
Attribute: Each characteristic, or quality, describing entity
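The terms above nest inside one another. A minimal Python sketch of that nesting, using hypothetical student and course data invented only for illustration:

```python
# Hypothetical data, invented only to illustrate the terms above.

# Field: a single value such as a name or a number.
# Record: a group of related fields describing one entity (here, a student).
record_1 = {"student_id": 1, "name": "Ana Lee", "course": "MIS 101"}
record_2 = {"student_id": 2, "name": "Ben Roy", "course": "MIS 101"}

# File: a group of records of the same type.
student_file = [record_1, record_2]

# A second file holding records of a different type.
course_file = [{"course": "MIS 101", "title": "Intro to MIS"}]

# Database: a group of related files.
database = {"students": student_file, "courses": course_file}

# The entity is the student; its attributes are the field names.
attributes = list(record_1.keys())
print(attributes)
```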
File Organization Terms and
Concepts - Cont.
Data Hierarchy
File Organization Terms and
Concepts - Cont.
Problems with Traditional File System
• Files maintained separately by different departments
• Data redundancy
• Data inconsistency
• Program-data dependence
• Lack of flexibility
• Poor security
• Lack of data sharing and availability
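Redundancy and inconsistency can be illustrated with two departments that each keep their own copy of the same customer. The sketch below is only an illustration; the file contents and the helper name find_inconsistencies are hypothetical.

```python
# Hypothetical departmental files, each keeping its own copy of customer data.
sales_file = [{"customer_id": 7, "name": "Dana Cole", "phone": "555-0101"}]
billing_file = [{"customer_id": 7, "name": "Dana Cole", "phone": "555-0199"}]


def find_inconsistencies(file_a, file_b, key="customer_id"):
    """Return (key value, field, value_a, value_b) for each shared field that disagrees."""
    index_b = {rec[key]: rec for rec in file_b}
    problems = []
    for rec_a in file_a:
        rec_b = index_b.get(rec_a[key])
        if rec_b is None:
            continue  # record exists in only one file
        for field in rec_a:
            if field in rec_b and rec_a[field] != rec_b[field]:
                problems.append((rec_a[key], field, rec_a[field], rec_b[field]))
    return problems


# The name is stored twice (redundancy); the phone numbers disagree (inconsistency).
conflicts = find_inconsistencies(sales_file, billing_file)
print(conflicts)
```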
File Organization Terms and
Concepts - Cont.
Consider a company that uses a Traditional File System across the
functional units shown in the following diagram.
What types of data would be replicated and used by the
applications within each of these functional units, using a
Traditional File System?
File Organization Terms and
Concepts - Cont.
Traditional File
The Database Approach to
Data Management
Eliminates many of the problems with traditional file
organization.
A database is an organized collection of structured information,
or data, typically stored electronically in a computer system
so that it can be easily accessed, managed, and updated.
Databases provide a central repository for data, which helps
control major issues such as data redundancy and
inconsistency.
Database Management
Systems (DBMS)
Database
• Serves many applications by centralizing data and
controlling redundant data
Database management system (DBMS)
• Interfaces between applications and physical data files
• Separates logical and physical views of data
• Solves problems of traditional file environment
o Controls redundancy
o Eliminates inconsistency
o Uncouples programs and data
o Enables organization to centrally manage data and data security
Relational DBMS
Represents data as two-dimensional tables
Each table contains data on an entity and its attributes
Table: grid of columns and rows
• Rows (tuples): Records for different entities
• Fields (columns): Each represents an attribute of the entity
• Key field: Field used to uniquely identify each record
• Primary key: Key field that serves as the unique identifier for records in its table
• Foreign key: Primary key used in second table as lookup field to
identify records from original table
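A minimal sketch of primary and foreign keys using Python's built-in sqlite3 module. The supplier and part tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database; the tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,   -- primary key of supplier
    supplier_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,   -- primary key of part
    part_name       TEXT NOT NULL,
    supplier_number INTEGER REFERENCES supplier(supplier_number)  -- foreign key
);
INSERT INTO supplier VALUES (8259, 'Supplier A');
INSERT INTO part VALUES (137, 'Door latch', 8259);
""")

# Use the foreign key as a lookup field to find the supplier of part 137.
row = conn.execute(
    "SELECT s.supplier_name FROM part p "
    "JOIN supplier s ON s.supplier_number = p.supplier_number "
    "WHERE p.part_number = ?",
    (137,),
).fetchone()
print(row[0])
```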
Common Types of Relational
DBMS
DBMS                            Platform
Microsoft Access                Desktop systems
DB2, Oracle, SQL Server         Relational DBMS for large mainframes, servers and midrange computers
MySQL                           Popular open source DBMS
Oracle Database Lite, SQLite    Mobile computing devices
Operations of a Relational DBMS
Three basic operations used to develop useful sets of data
SELECT
o Creates a subset consisting of all records that meet stated criteria
JOIN
o Combines relational tables to provide user with more information
than available in individual tables
PROJECT
o Creates subset of columns in table, creating tables with only the
information specified
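The three operations can be sketched with Python's built-in sqlite3 module. The tables and rows are hypothetical. Note that in SQL all three are written with the SELECT statement: the WHERE clause does the selecting, JOIN the joining, and the column list the projecting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT,
                   unit_price REAL, supplier_number INTEGER);
INSERT INTO supplier VALUES (1, 'Supplier A'), (2, 'Supplier B');
INSERT INTO part VALUES (137, 'Door latch', 22.0, 1),
                        (150, 'Door seal', 6.0, 2),
                        (152, 'Compressor', 54.0, 1);
""")

# SELECT operation: keep only the rows (records) that meet a criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE part_number IN (137, 150) ORDER BY part_number"
).fetchall()

# JOIN operation: combine the two tables on the shared supplier_number.
joined = conn.execute(
    "SELECT p.part_number, p.part_name, s.supplier_number, s.supplier_name "
    "FROM part p JOIN supplier s ON s.supplier_number = p.supplier_number "
    "ORDER BY p.part_number"
).fetchall()

# PROJECT operation: keep only the named columns.
projected = conn.execute(
    "SELECT part_name, unit_price FROM part ORDER BY part_number"
).fetchall()

print(selected, joined, projected, sep="\n")
```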
Capabilities of Database Management
Systems
Data definition capability
Data dictionary
Querying and reporting
• Data manipulation language
o Structured Query Language (SQL)
Many DBMSs have report generation capabilities for creating
polished reports (Microsoft Access)
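A rough sketch of these capabilities using Python's built-in sqlite3 module. The employee table is hypothetical, and SQLite's PRAGMA table_info is used here only as a stand-in for a data dictionary lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the data.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT)"
)

# Data dictionary: SQLite reports each column's name and type via PRAGMA table_info.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

# Data manipulation: add rows, then query them with SQL.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ana Lee", "Sales"), (2, "Ben Roy", "Finance")],
)
sales_names = [
    row[0]
    for row in conn.execute("SELECT name FROM employee WHERE dept = ?", ("Sales",))
]

print(columns)
print(sales_names)
```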
Designing Databases
Conceptual design vs. physical design
Normalization
• Streamlining complex groupings of data to minimize redundant
data elements and awkward many-to-many relationships
Referential integrity
• Rules used by RDBMS to ensure relationships between tables
remain consistent
Entity-relationship diagram
A correct data model is essential for a system serving the
business well
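A minimal sketch of normalization and referential integrity with Python's built-in sqlite3 module. The tables are hypothetical. The supplier name is stored once in its own table rather than repeated on every part, and SQLite is asked to reject a part that points at a supplier that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is turned on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT,
    supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
);
INSERT INTO supplier VALUES (1, 'Supplier A');
INSERT INTO part VALUES (137, 'Door latch', 1);
""")

# Supplier 99 does not exist, so this insert would break the relationship.
try:
    conn.execute("INSERT INTO part VALUES (150, 'Door seal', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
print(rejected, part_count)
```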
Non-Relational Databases and
Databases in the Cloud
Non-relational databases: “NoSQL”
• More flexible data model
• Data sets stored across distributed machines
• Easier to scale
• Handle large volumes of unstructured and structured data
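The flexible data model can be illustrated without any database product. In this sketch, plain Python dictionaries stand in for JSON-like documents; the collection, its fields, and the find helper are hypothetical and do not represent any particular NoSQL API.

```python
import json

# Hypothetical document collection: no fixed schema, so documents may differ in fields.
products = [
    {"_id": 1, "name": "Washer", "price": 499.0, "tags": ["appliance", "laundry"]},
    {"_id": 2, "name": "E-book", "price": 9.99, "download_url": "https://example.com/book"},
    {"_id": 3, "name": "Gift card"},
]


def find(collection, **criteria):
    """Return the documents whose fields equal all of the given criteria."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Documents serialize directly to JSON.
as_json = json.dumps(find(products, name="Washer"))

# A field that only some documents carry can still be queried.
tagged = [doc["name"] for doc in products if "tags" in doc]
print(as_json)
print(tagged)
```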
Databases in the cloud
• Appeal to start-ups, smaller businesses
• Amazon Relational Database Service, Microsoft SQL Azure, MongoDB
• Private clouds
Cloud Databases
Typically offer less functionality than on-premises databases
Cloud computing vendors provide relational database services
as well.
Amazon Relational Database Service (Amazon RDS)
Oracle Database Cloud Service
Microsoft SQL Azure Database
Capabilities of DBMS
Data Definition Language
Data Dictionary
Data Manipulation Language
Data Control Language
Report Generation
The Challenge of Big Data
Big data
• Massive sets of unstructured/semi-structured data from web
traffic, social media, sensors, and so on
Volumes too great for typical DBMS
• Petabytes, exabytes of data
Can reveal more patterns, relationships and anomalies
Requires new tools and technologies to manage and analyze
Business Intelligence
Infrastructure
Array of tools for obtaining information from separate systems
and from big data
Data warehouse
• Stores current and historical data from many core operational
transaction systems
• Consolidates and standardizes information for use across
enterprise, but data cannot be altered
• Provides analysis and reporting tools
Business Intelligence
Infrastructure – Cont.
Data marts
• Subset of data warehouse
• Typically focus on single subject or line of business
Hadoop
• Enables distributed parallel processing of big data across inexpensive
computers
• Key services
o Hadoop Distributed File System (HDFS): data storage
o MapReduce: breaks processing of large datasets into smaller tasks run in parallel across the cluster
o HBase: NoSQL database
Used by Yahoo, NextBio
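The MapReduce idea can be sketched in a single Python process as a word count. This is only an illustration of the map, group, and reduce steps; it does not use Hadoop, and the function names and text chunks are hypothetical.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical chunks; in a cluster each chunk would be mapped on a different machine.
chunks = ["big data big", "data tools", "big tools"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```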
Business Intelligence
Infrastructure – Cont.
In-memory computing
• Used in big data analysis
• Uses computer's main memory (RAM) for data storage to avoid
delays in retrieving data from disk storage
• Can reduce hours/days of processing to seconds
• Requires optimized hardware
Analytic platforms
• High-speed platforms using both relational and nonrelational
tools optimized for large datasets
Analytical Tools for
Analyzing Big Data
Tools for consolidating, analyzing, and providing access to vast
amounts of data to help users make better business decisions
• Multidimensional data analysis (OLAP)
• Data mining
• Text mining
• Web mining
Analytical Tools for
Analyzing Big Data – Cont.
Online Analytical Processing (OLAP)
Supports multidimensional data analysis
• Viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region, time
period) is different dimension
• Example: How many washers sold in the East in June compared
with other regions?
OLAP enables rapid, online answers to ad hoc queries
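The washer question above can be sketched as a small roll-up over sales facts. The figures, the dimension names, and the roll_up helper are hypothetical; real OLAP tools precompute and store such aggregates.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units_sold).
sales = [
    ("washer", "East", "June", 120),
    ("washer", "West", "June", 95),
    ("washer", "East", "May", 80),
    ("dryer", "East", "June", 60),
    ("washer", "South", "June", 70),
]


def roll_up(facts, *dims):
    """Total units sold for each combination of the requested dimensions."""
    names = ("product", "region", "month")
    totals = defaultdict(int)
    for fact in facts:
        row = dict(zip(names, fact[:3]))
        totals[tuple(row[dim] for dim in dims)] += fact[3]
    return dict(totals)


# Washers sold in June, compared across regions.
june_washers = [f for f in sales if f[0] == "washer" and f[2] == "June"]
by_region = roll_up(june_washers, "region")

# The same facts viewed along two other dimensions.
by_product_month = roll_up(sales, "product", "month")
print(by_region)
print(by_product_month)
```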
Analytical Tools for
Analyzing Big Data – Cont.
Data Mining
Finds hidden patterns, relationships in datasets
• Example: customer buying patterns
Infers rules to predict future behavior
Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
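Associations, the first type listed, can be sketched by counting which items appear together in purchases. The baskets and the confidence helper are hypothetical, and this simple co-occurrence count is only an illustration, not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


# Of the three baskets with chips, two also contain cola.
chips_to_cola = confidence("chips", "cola")
print(round(chips_to_cola, 2))
```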
Analytical Tools for Analyzing
Big Data – Cont.
Text Mining
Extracts key elements from large unstructured datasets
Sentiment analysis software
Used with unstructured data in the form of text files.
Text files are believed to account for over 80 percent of useful
organizational information
Text files are a major source of big data that firms want to
analyze.
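A very rough sketch of sentiment analysis on text, assuming a tiny hand-made word list. The word lists, reviews, and sentiment_score function are hypothetical; real sentiment tools rely on much larger lexicons or trained models.

```python
import re

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "poor", "rude"}


def sentiment_score(text):
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


reviews = [
    "Great washer, fast delivery and helpful staff",
    "Arrived broken and support was slow",
]
scores = [sentiment_score(review) for review in reviews]
print(scores)
```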
Analytical Tools for
Analyzing Big Data – Cont.
Web Mining
Discovery and analysis of useful patterns and information from
web
• Example: understanding customer behavior, evaluating the
effectiveness of a website
Web content mining
Web structure mining
Web usage mining
Databases and the Web
Many companies use the web to make some internal databases
available to customers or partners
Typical configuration includes:
• Web server
• Application server/middleware/CGI scripts
• Database server (hosting DBMS)
Advantages of using the web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add web interface to system
Establishing an Information
Policy
Firm’s rules, procedures, roles for sharing, managing,
standardizing data
Data administration
• Establishes policies and procedures to manage data
Data governance
• Deals with policies and processes for managing availability,
usability, integrity, and security of data, especially regarding
government regulations
Database administration
• Creating and maintaining database
Ensuring Data Quality
More than 25 percent of critical data in Fortune 1000 company
databases are inaccurate or incomplete
Before new database is in place, a firm must:
• Identify and correct faulty data
• Establish better routines for editing data once the database is in
operation
Data quality audit
Data cleansing
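A minimal sketch of data cleansing: standardize each field, then set aside records that are incomplete or duplicated. The customer records, the rules, and the cleanse function are hypothetical.

```python
# Hypothetical customer records with typical quality problems.
raw = [
    {"id": 1, "name": " ana lee ", "email": "ANA@EXAMPLE.COM"},
    {"id": 2, "name": "Ben Roy", "email": ""},
    {"id": 3, "name": "Ana Lee", "email": "ana@example.com"},
]


def cleanse(records):
    """Standardize fields, then split records into clean rows and rows needing review."""
    clean, flagged, seen_emails = [], [], set()
    for rec in records:
        fixed = {
            "id": rec["id"],
            "name": " ".join(rec["name"].split()).title(),  # trim spaces, fix case
            "email": rec["email"].strip().lower(),
        }
        if not fixed["email"]:
            flagged.append((fixed, "missing email"))
        elif fixed["email"] in seen_emails:
            flagged.append((fixed, "duplicate email"))
        else:
            seen_emails.add(fixed["email"])
            clean.append(fixed)
    return clean, flagged


clean, flagged = cleanse(raw)
print(clean)
print([(rec["id"], reason) for rec, reason in flagged])
```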
References
Laudon & Laudon (2016). Management Information Systems:
Managing the Digital Firm. Chapter 6.
Thank you
Kerry-Ann Xavier