BCA CTIS Sem-5 Introduction to Data Science
Course Dossier
Semester: V
Subject Name: INTRODUCTION TO DATA SCIENCE
Subject Code: 14110503
SWARRNIM STARTUP & INNOVATION UNIVERSITY
Swarrnim School of Business (BCA -CTIS)
Open Elective-I
Introduction to Data Science
Semester: V
Code: ________
Th Pr Th Pr
2 - - 2 2 30 - 70 - 100
Objectives: -
● Apply quantitative modeling and data analysis techniques to solve real-world
business problems, communicate findings, and present results effectively using data
visualization techniques.
● Recognize and analyze ethical issues in business related to intellectual property, data
security, integrity, and privacy.
Prerequisites: -NA
Course outline:-
Sr. No.  Course Contents  Number of Hours
1 Data Science- An Overview 6
Introduction to Data Science, definition and description of Data Science,
history and development of Data Science, terminology related to Data
Science, basic framework and architecture, difference between Data Science
and business analytics, importance of Data Science in today’s business world,
primary components of Data Science, users of Data Science and its hierarchy,
overview of different Data Science techniques, challenges and opportunities in
business analytics, industrial applications of Data Science techniques.
2 Mathematics and Statistics in Data Science 6
Role of mathematics in Data Science, importance of probability and statistics
in Data Science, important types of statistical measures in Data Science:
descriptive, predictive and prescriptive statistics, introduction to statistical
inference and its usage in Data Science, application of statistical techniques in
Data Science, overview of linear algebra: matrix and vector theory, role of
linear algebra in Data Science, exploratory data analysis (EDA) and visualization
techniques, difference between exploratory and descriptive statistics, EDA and
visualization as key components of Data Science.
3 Machine Learning in Data Science 6
Role of machine learning in Data Science, different types of machine learning
techniques and their broad scope in Data Science: supervised, unsupervised,
reinforcement and deep learning, differences between these machine
learning techniques, brief introduction to machine learning algorithms,
importance of machine learning in today’s business, difference between
machine learning classification and prediction.
4 Computers in Data Science 6
Role of computer science in Data Science, various components of computer
science used in Data Science, role of relational database systems in Data
Science: SQL, NoSQL, role of data warehousing in Data Science, terms
related to data warehousing techniques, importance of operating system concepts
and memory management, freely available software tools used in
Data Science: R, Python, important proprietary software tools, different
business intelligence tools and their crucial role in Data Science project
presentation.
5 Data Science Project Management 6
Data Science project framework, execution flow of a Data Science project,
various components of Data Science projects, stakeholders of Data Science
project, industry use cases of Data Science implementation, challenges and
scope of Data Science project management, process evaluation model,
comparison of Data Science project methods, improvement in success of Data
Science project models.
Learning Outcomes:-
● Understands the various components of computer science being used for Data Science
● The class will be taught using theory and the case-based method. In addition to assigning
case studies, the course instructor will spend considerable time on understanding the
concept of innovation through the eyes of the consumer. The instructor will also cover
ways to think innovatively, making liberal use of thinking techniques.
Books Recommended:-
Text Books:
1. Data Science from Scratch: First Principles with Python, 1st Edition, by Joel Grus
2. Principles of Data Science by Sinan Ozdemir (2016), Packt
3. Data Science for Dummies by Lillian Pierson (2015)
Reference Books:
1. Data Science for Business: What You Need to Know about Data Mining and
Data-Analytic Thinking by Foster Provost, Tom Fawcett
2. Data Smart: Using Data Science to Transform Information into Insight 1st Edition by
John W. Foreman. (2015) Wiley Publication
MODULE 1: Data Science - An Overview
Objective:
The objective of this assignment is to provide students with a
comprehensive understanding of Data Science, its history,
development, and the fundamental components involved in the Data
Science process. The assignment also aims to highlight the
importance of Data Science in today's business world, explore
different Data Science techniques, and analyze their applications
across various industries.
Assignment Questions:
Submission Guidelines:
- The assignment should be typed and submitted as a document
(e.g., MS Word or PDF).
- Clearly label each question and provide the corresponding answers.
- Use appropriate headings, subheadings, and bullet points for
clarity.
- Cite your sources whenever you use external references or
examples.
11. Choose one industry and describe how Data Science techniques
have been applied to solve real-world problems in that industry.
Provide specific examples to support your answer.
12. Imagine you are part of a Data Science team working for a retail
company. Explain how you would use Data Science techniques to
analyze customer purchase patterns and recommend personalized
products to customers.
CASE STUDIES
Case Study 1: Healthcare Industry - Predictive Analytics for Patient
Readmission
Background:
A leading hospital chain aims to reduce patient readmissions, which
not only impact the quality of care but also result in increased
healthcare costs. The hospital management wants to leverage Data
Science techniques to predict which patients are at high risk of
readmission, enabling timely interventions and personalized care
plans.
Objective:
Develop a predictive analytics model using Data Science techniques
to identify patients at high risk of readmission.
Data:
The hospital has collected electronic health records (EHR) of patients
over the past five years, including demographics, medical history,
diagnostic tests, medications, and previous hospitalization details.
Challenges:
1. Dealing with imbalanced data where the majority of patients do
not experience readmission.
2. Ensuring patient privacy and compliance with data-protection
requirements while performing analysis.
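
For the first challenge above, one common remedy is to give the rare class a larger weight during model training. The sketch below is only an illustration using the Python standard library: the label list is hypothetical (1 = readmitted, 0 = not readmitted), and the weighting rule, n_samples / (n_classes * class_count), is one widely used heuristic, not necessarily the method intended by this case study.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical labels: 90 patients not readmitted (0), 10 readmitted (1).
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"class {cls}: weight {weight:.3f}")
```

With these hypothetical labels the readmitted class receives a weight nine times larger than the majority class, so errors on readmitted patients count for more when a model that supports class weights is trained.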
Solution:
1. Data Preprocessing: Clean and preprocess the EHR data, handling
missing values and converting categorical variables into numerical
formats.
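
The preprocessing step above can be sketched as follows. This is a minimal illustration under stated assumptions: the patient records and column names (age, gender, prior_admissions) are hypothetical, missing values are filled with the column median, and the categorical column is converted with one-hot encoding. Other imputation and encoding strategies are equally valid choices.

```python
from statistics import median

# Hypothetical patient records; None marks a missing value.
records = [
    {"age": 70, "gender": "F", "prior_admissions": 2},
    {"age": None, "gender": "M", "prior_admissions": 0},
    {"age": 50, "gender": "F", "prior_admissions": None},
    {"age": 60, "gender": "M", "prior_admissions": 1},
]

def impute_median(rows, column):
    """Replace missing values in a numeric column with the column median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded

clean = impute_median(records, "age")
clean = impute_median(clean, "prior_admissions")
clean = one_hot(clean, "gender")

for row in clean:
    print(row)
```

After these steps every record contains only numeric values, which is the form most modeling techniques expect.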
Case Study 2: E-commerce Industry - Customer Segmentation
Background:
A large e-commerce company wants to understand its customer base
better and tailor marketing strategies to different customer
segments. They have vast amounts of transactional data but lack
insights into customer behavior and preferences.
Objective:
Segment customers based on their buying behavior and preferences
to create targeted marketing campaigns.
Data:
The company has a database of historical transactions, including
customer IDs, products purchased, transaction dates, and order
values.
Challenges:
1. Dealing with high-dimensional and noisy transaction data.
2. Identifying an optimal number of customer segments for effective
targeting.
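
For the second challenge above, a common heuristic is the elbow method: run a clustering algorithm for several values of k and look for the point where the within-cluster sum of squared distances (inertia) stops dropping sharply. The sketch below is a simplified, pure-Python k-means written for illustration only. The two-dimensional points are hypothetical customer features, and the farthest-point initialisation is an assumption chosen to keep the example deterministic.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Farthest-point initialisation: deterministic and spreads centroids out."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)
        )
        centroids.append(farthest)
    return centroids

def kmeans_inertia(points, k, iterations=20):
    """Run basic k-means and return the within-cluster sum of squared distances."""
    centroids = init_centroids(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)

# Hypothetical customer features forming three visible groups.
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (1, 10), (1, 11), (2, 10), (2, 11),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 5)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.2f}")
```

On this toy data the inertia falls steeply up to k = 3 and only slightly afterwards, which suggests three segments. On real transaction data the elbow is usually less clear, so it is worth cross-checking with another criterion and with business judgment.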
Solution:
1. Data Preparation: Transform transactional data into a customer-
product matrix with customer IDs as rows and products as columns,
and populate it with binary values (1 for purchased, 0 for not
purchased).
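
The data preparation step above can be sketched as follows. The transaction list, customer IDs and product names are hypothetical, and only the customer and product fields are used; transaction dates and order values are ignored in this simplified illustration.

```python
# Hypothetical transactions as (customer_id, product) pairs.
transactions = [
    ("C1", "laptop"),
    ("C1", "mouse"),
    ("C2", "mouse"),
    ("C3", "laptop"),
    ("C3", "keyboard"),
    ("C1", "mouse"),  # a repeat purchase still produces a single 1
]

def customer_product_matrix(transactions):
    """Build a binary matrix: rows are customers, columns are products."""
    customers = sorted({c for c, _ in transactions})
    products = sorted({p for _, p in transactions})
    purchased = set(transactions)
    matrix = [
        [1 if (c, p) in purchased else 0 for p in products]
        for c in customers
    ]
    return customers, products, matrix

customers, products, matrix = customer_product_matrix(transactions)

print("customer", products)
for customer, row in zip(customers, matrix):
    print(customer, row)
```

Each row of the resulting matrix describes one customer by the products they have bought, which is a form that segmentation techniques can work with directly.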