Advanced Certificate Program in Data Science and AI Curriculum v1.0

Uploaded by

archana dande

ADVANCED CERTIFICATE PROGRAM IN

DATA SCIENCE AND AI


IN COLLABORATION WITH

Course Curriculum
Advanced Certification Course in Data Science and AI

COURSE 1 - Python for Data Science and AI

9 MODULES

M1 - Introduction to Data Science

M2 - Data Collection and Cleaning

M3 - Python Fundamentals

M4 - Control Flow and Functions

M5 - Array Computations using NumPy

M6 - Data Manipulation using Pandas

M7 - Visualizing Data using Matplotlib and Seaborn

M8 - Web Scraping

M9 - End-Course Assessment

© Brain4ce Education Solutions Pvt. Ltd. All rights Reserved



COURSE 2 - Predictive Analytics

8 MODULES

M1 - Introduction to Statistical Analysis

M2 - Exploratory Data Analysis

M3 - Introduction to Probability

M4 - Probability Distribution Functions

M5 - Inferential Statistics - I

M6 - Inferential Statistics - II

M7 - Regression

M8 - End-Course Assessment


COURSE 3 - Machine Learning

13 MODULES

M1 - Introduction to Machine Learning

M2 - Supervised Learning - Regression

M3 - Evaluating Regression Models

M4 - Supervised Learning – Classification

M5 - Decision Tree and Random Forest Models

M6 - Mathematical and Bayesian Models

M7 - Dimensionality Reduction

M8 - Unsupervised Learning using Clustering

M9 - Model Evaluation and Hyperparameter Tuning

M10 - Model Boosting and Optimization

M11 - Association Rule Mining and Recommendation Engines

M12 - Time Series Analysis

M13 - End Course Assessment


COURSE 4 - Mid Program Project

1 MODULE

Mid Program Project


COURSE 5 - Machine Learning on Cloud

5 MODULES

M1 - Cloud Computing and AWS Foundations

M2 - Conversational AI Development with SageMaker, Lex and Polly

M3 - Image and Data Analysis with AWS Rekognition, Textract, and QuickSight

M4 - Machine Learning Model Deployment in AWS

M5 - End Course Assessment


COURSE 6 - Natural Language Processing

9 MODULES

M1 - Introduction to NLP

M2 - Text Pre-processing

M3 - Analyzing Sentence Structure

M4 - Text Classification

M6 - Building a Resume Classifier (Self-paced)

M7 - Building an Intent-Based RASA Chatbot (Self-paced)

M8 - NLP in Production (Self-paced)

M9 - End Course Assessment


COURSE 7 - Deep Learning

9 MODULES

M1 - Introduction to Deep Learning

M2 - Getting started with TensorFlow 2.0 with TensorBoard

M3 - Neural Networks with TensorFlow 2.x

M4 - Deep Learning for Images using CNN

M5 - TensorFlow Hub for Object Detection using Faster RCNN

M6 - Object Detection Using OpenCV - Part 1

M7 - Object Detection Using OpenCV - Part 2

M8 - Deep Learning for Sequences using RNN (Self-paced)

M9 - End Course Assessment


COURSE 8 - Data Warehousing and Big Data Storage

11 MODULES

M1 - Data Warehousing

M2 - Data Integration and ETL

M3 - Getting Started with Big Data

M4 - Storing Big Data in a Distributed Cluster

M5 - Data Mining (Self-paced)

M6 - Frequent Pattern Mining (Self-paced)

M7 - Data Ingestion in Hadoop using Sqoop and Flume (Self-paced)

M8 - Spark’s Big Data Engine and RDD Concepts (Self-paced)

M9 - Relational Data Processing with Spark - Spark SQL (Self-paced)

M10 - Machine Learning with Spark - Spark ML (Self-paced)

M11 - End Course Assessment


COURSE 9 - Data Visualization using Tableau

5 MODULES

M1 - Data Connection and Visualization in Tableau

M2 - Calculations in Tableau

M3 - Advanced Visualizations

M4 - Sharing Your Insights Through Dashboards

M5 - End Course Assessment


COURSE 10 - Capstone Project

1 MODULE

Capstone Project


BONUS COURSE - SQL Essentials for Data Science and AI (Self-paced)

6 MODULES

M1 - Introduction to RDBMS and MySQL

M2 - Database Modeling

M3 - Creating Databases and Tables

M4 - Querying and Modifying Tables

M5 - Joins and Functions in MySQL

M6 - Database Integration with Python


BONUS COURSE - Mastering ChatGPT (Self-Paced)

5 MODULES

M1 - Introduction to OpenAI and ChatGPT

M2 - Business Use Cases of ChatGPT

M3 - Deploying and Integrating ChatGPT in Business Applications

M4 - GPT Models, Pre-processing and Fine-tuning ChatGPT

M5 - Working with GPT-3 and OpenAI API


BONUS COURSE - Sequence Learning (Self-paced)

7 MODULES

M1 - Introduction to Sequence Learning

M2 - RNN vs LSTM with Google Stock Price

M3 - Sentiment Analysis on Zomato Reviews using LSTM

M4 - Introduction to the Transformer Model

M5 - BERT and GPT2 using Transformer

M6 - Machine Translation with MT5

M7 - Building a Question-Answer Prediction Model using BERT


BONUS COURSE - Power BI for Data Visualization (Self-paced)

3 MODULES

M1 - Power BI Desktop and Data Transformation

M2 - Data Analysis Expression (DAX)

M3 - Data Visualization and Power BI Service


Data Science using Python
Course Curriculum

Module 1 - Introduction to Data Science


Understanding Data Science
Myths about Data Science
Applications of Data Science
Market Demand for Data Scientists
Job Roles and Responsibilities
Skillset Required for Data Scientists
Python for Data Science
Data Science Life Cycle
Case Study: Diabetes Prediction

Module 2 - Data Collection and Cleaning


Data Science Methodology: A Quick Review
Introduction to Data Collection
Data Collection Process
Sources of Data in Organizations
Instruments for Collecting Data
In-Class Activity: Student Survey
Significance of Clean Data
Data Cleaning and Pre-Processing Techniques
Data Cleaning
Data Integration
Data Transformation
Data Reduction
Case Study: Cleaning Diabetes Data

Module 3 - Python Fundamentals


What is Python?
Applications of Python
Companies Using Python
Python Fundamentals
Syntax Rules
Indentation
Tokens
Data Types
Boolean
Mutable and Immutable Data Types
Importing Modules in Python

Module 4 - Control Flow and Functions


Control Flow in Python
Conditional Statements
Loops
While Loop
for Loop
Nested Loop
Loop Control Statements
Range Function in Python
Functions and Arguments
File Handling in Python
Classes and Objects
Variable Scope and Global Keyword
Exceptions in Python

Module 5 - Array Computations using NumPy


Introduction to NumPy
NumPy Array
Array Creation Routines
Basic Operations
Arithmetic Operators
Matrix Product

Functions
Universal Functions
Aggregate Functions
Logic Functions
Indexing
Fancy Indexing
Slicing
Iterating in a NumPy Array
Array Manipulation
File Handling Using NumPy

Module 6 - Data Manipulation using Pandas


Introduction to Pandas
Data Structures in Pandas
Importing and Exporting Data
Essential Functionality of Series and DataFrame
Combining Data
Cleaning Data
Grouping Data

Module 7 - Visualizing Data using Matplotlib and Seaborn


Why Data Visualization?
The Matplotlib Library
Types of Plots and Charts
Line Plot
Bar Plot
Histogram
Pie Chart
Scatter Plot
Boxplot
Saving Charts
Customizing Visualizations
Saving Plots
Grid and Subplots
Other Visualization Libraries


Module 8 - Web Scraping


What is Web Scraping?
Use Cases of Web Scraping
Need for Web Scraping in Data Science
Web Scraping Process Flow
Popular Tools for Web Scraping in Python
Requests
Beautiful Soup
Scrapy
Beautiful Soup vs. Scrapy
Web Scraping Challenges

Predictive Analytics
Course Curriculum

Module 1 - Introduction to Statistical Analysis


Data and Information
Introduction to Data Types
Categorical Data
Numerical Data
Variables
Introduction to Statistics
Importance of Statistical Analysis in Data Science
Statistical Analysis Divisions
Population & Sample
Descriptive Statistics
Measures of Central Tendency
Measures of Dispersion
Measures of Position
Case Study

Module 2 - Exploratory Data Analysis


Exploratory Data Analysis (EDA)
EDA Techniques
EDA Classification
Univariate Non-graphical EDA
Categorical Data
Numerical Data
Univariate Graphical EDA
Multivariate Non-graphical EDA
Cross Tabulation
Covariance and Correlation

Multivariate Graphical EDA
Scatter Plots
Heat Maps
Graphical EDA Techniques
Case Study

Module 3 - Introduction to Probability


Introduction to Probability
Define probability
Common terminology of probability
Events
Union
Intersection
Types of Events
Rules of Probability
Types of Probability
Marginal
Joint
Conditional
Odds and Log Odds
Bayesian Inference
Bayes Theorem
Random Variable
General Product Rule

Module 4 - Probability Distribution Functions


General Concepts of Probability Distribution
Discrete Probability Distribution
Continuous Probability Distributions
Central Limit Theorem

Module 6 - Inferential Statistics - II


One Sample Z-test
One Sample T-test
Independent Sample T-test
Chi-Square Test
Regression
ANOVA

Module 7 - Regression
Use Case: House Price Prediction
Introduction to Regression
Linear Regression
• Simple Linear Regression
• Multiple Linear Regression
Evaluation Metrics in Regression Models
Use Case: Predict University Graduate Admissions
Logistic Regression
Use Case: Predict Ad-Click Using Logistic Regression
Regularisation: Ridge, Lasso, & Elastic Net
Use-Case: Stock Shoes Price Prediction

Machine Learning
Course Curriculum

Module 1 - Introduction to Machine Learning


What is Machine Learning?
Machine Learning Processes
AI vs. Machine Learning vs. Deep Learning
Significance of Machine Learning
Applications of Machine Learning
Myths about Machine Learning
Types of Machine Learning
Data Pre-processing Techniques
Train/Test split method

Module 2 - Supervised Learning - Regression


Classification of Supervised Learning Algorithms
Regression
Linear Regression
Assumptions of Linear Regression
Types of Linear Regression
OLS Regression Results Summary
Calculation of R²
Gradient Descent
Regularization techniques
Case Study

Module 3 - Evaluating Regression Models


Model Evaluation
Scenario - BMI Prediction
Bias-Variance Trade-off
Learning and Validation Curves
Techniques for Evaluating Regression Models
Relative Standard Deviation
Relative Squared Error
Mean Absolute Error
Relative Absolute Error
Mean Squared Error
Root Mean Squared Error on Prediction
R-Square

Module 4 - Supervised Learning - Classification


What is Classification?
Classification vs. Regression
Types of Classification Algorithms
Logistic Regression
What is Logistic Regression?
Log Odds
Logistic Regression Cost Function
Maximum Likelihood
Evaluation Parameters

Module 5 - Decision Tree and Random Forest Models


Decision Tree
Common Terminologies
Decision Tree using CART Algorithm
Decision Tree using ID3 Algorithm
Attribute Selection
Random Forest

Module 6 - Mathematical and Bayesian Models


Naive Bayes Classification
Revisiting Bayes' Theorem
Likelihood
K-Nearest Neighbors (K-NN)
Distance Metric
Standardization (Normalization, Z-score)
Choosing K
Support Vector Machines (SVM)
Linear SVM Classification
Non-Linear SVM Classification
SVM Regression
Kernel SVM
Case Study

Module 7 - Dimensionality Reduction


Curse of Dimensionality
What is Dimensionality Reduction
Why Dimensionality Reduction
Feature Selection and Extraction
Principal Component Analysis
Eigen Vector/Singular Vector
Eigen Value/Singular Value
Scree Plot
Linear Discriminant Analysis (LDA)
Other Dimensionality Reduction Techniques

Module 8 - Unsupervised Learning using Clustering


What is Unsupervised Learning?
What is Clustering?
Types of Clustering
Hierarchical Clustering
Agglomerative Clustering
Divisive Clustering
K-Means Clustering
Euclidean Distance
Elbow Method
Fuzzy C-Means Clustering
DBSCAN Clustering

Module 9 - Model Evaluation and Hyperparameter Tuning


Model Selection
Resampling Techniques
Need for Model Evaluation
Metrics for evaluating Regression Models
Metrics for evaluating Classification Models
Hyperparameter Tuning

Module 10 - Model Boosting and Optimization


Ensemble Learning
Bagging
Boosting
AdaBoost
Gradient Boosting
XGBoost
CatBoost
Model Optimization
Elements of Optimization
Linear Programming
Examples
Applications
Formulating Optimization
Accelerated Gradient Methods
Second-Order Methods


Module 11 - Association Rule Mining and Recommendation Engines (Self-Paced)


Association Rule Mining
Support
Confidence
Lift
Apriori Algorithm
Market Basket Analysis
Recommendation Engine
User-Based Collaborative Filtering (UBCF)
Content-Based Filtering (CBF)

Module 12 - Time Series Analysis (Self-Paced)


Time Series Analysis
Components of Time Series
Types of Data
Stationary Data
Non-Stationary Data
Checks for Stationarity of Data
Augmented Dickey-Fuller Test
Convert Non-Stationary Data to Stationary Data
Differencing
Seasonal Differencing
Transformation
Time Series Analysis Model
AR → MA → ARMA
ARIMA

Mid Program Project
Course Curriculum

Project : Movie Recommendation System

Over the past two decades, there has been a monumental shift in how people access
and consume video content. With widespread access to broadband internet,
platforms such as YouTube, Netflix, and HBO Go emerged and steadily grew to
prominence. Although not a household name in itself, OTT is the technology
that made the streaming revolution possible.

OTT stands for Over The Top and refers to any video streaming service that delivers
content to users over the internet. Platforms such as Prime Video, Netflix, Hotstar,
Zee5, and SonyLIV charge a subscription for this access. Even with access to all of
these platforms, choosing your next movie to watch can still be a daunting task.

“MyNextMovie” is a budding startup in the recommendation space. It works on top of
various OTT platforms and suggests to its customers which movie to watch next. Its
core business is to build a recommendation layer on top of these OTT platforms so
that it can make suitable recommendations to its customers. Because the company is
still in research mode, it wants to experiment with open-source data first to
understand the depth of the models it can deliver.

The data for this exercise is open-source data that has been collected and made
available by the MovieLens website, a part of GroupLens Research. The data sets
were collected over various periods of time, depending on the size of the set.

You have recently joined as a Data Scientist at “MyNextMovie” and plan to help the
existing team to set up a recommendation platform.
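
As a starting point for the experimentation described above, the sketch below shows one possible approach: user-based collaborative filtering with cosine similarity, a topic listed in the Machine Learning course. It is a minimal illustration only, not the project's prescribed solution. The `ratings` dictionary is toy data invented for this example (it is not taken from MovieLens), and the function names `cosine` and `recommend` are hypothetical.

```python
from math import sqrt

# Toy user -> {movie: rating} data, invented for illustration only
# (not taken from the MovieLens data sets).
ratings = {
    "u1": {"A": 5.0, "B": 4.0, "C": 1.0},
    "u2": {"A": 4.0, "B": 5.0, "D": 2.0},
    "u3": {"C": 5.0, "D": 4.0},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    common = set(u) & set(v)
    dot = sum(u[m] * v[m] for m in common)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, ratings, top_n=2):
    """Rank movies the user has not rated by the similarity-weighted
    average rating given by other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap in rated movies
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
                weights[movie] = weights.get(movie, 0.0) + sim
    ranked = sorted(((scores[m] / weights[m], m) for m in scores), reverse=True)
    return [movie for _, movie in ranked[:top_n]]


print(recommend("u3", ratings))
```

With real MovieLens ratings, the same idea would typically be applied to a user-by-movie matrix built with Pandas, and the result compared against other approaches such as content-based filtering.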

Machine Learning on Cloud
Course Curriculum

Module 1 - Cloud Computing and AWS Foundations


Classic Data Center
Virtualization
Cloud and Cloud Computing
Cloud Computing Service Models
Cloud Computing Deployment Models
Service Comparison: AWS, Azure, and GCP
Amazon Web Services (AWS) and its Benefits
AWS Global Infrastructure
AWS Regions and Replication of data between the Regions
Availability Zones and High Availability
PoP Locations
Sign up for an AWS Free Tier Account
Different Amazon Web Services
Ways to access AWS: CLI, Console, and SDKs
Explore Management Console and Configure AWS CLI
AWS CloudShell

Module 2 - Conversational AI Development with SageMaker, Lex, & Polly


Machine Learning on AWS cloud platform
Amazon SageMaker
Amazon SageMaker components
What are Chatbots?
Amazon Lex
Key concepts of Amazon Lex
Working of Amazon Lex
Amazon Polly
How Amazon Polly Works

Module 3 - Image and Data Analysis with AWS Rekognition, Textract, and QuickSight


Introduction to Image Analysis
Amazon Textract
Amazon Rekognition
Image and Video Analysis with Amazon Rekognition
What is Business Intelligence?
Amazon QuickSight: Cloud Based BI

Module 4 - Machine Learning Model Deployment in AWS


Use-Case: ML in Cloud - Vehicle Price Prediction
Why Use Machine Learning (ML) in Cloud?
ML Cloud Platforms
Amazon ML vs Google Cloud AutoML vs IBM Watson
Companies Using AWS SageMaker
Machine Learning With AWS SageMaker
Build, Train, and Deploy a ML Model

Natural Language Processing
Course Curriculum

Module 1 - Introduction to NLP


Overview of Text Mining
Need of Text Mining
Natural Language Processing (NLP) in Text Mining
Applications of Text Mining
OS Module
Reading, Writing to text and word files
Setting the NLTK Environment
Accessing the NLTK Corpora

Module 2 - Text Pre-processing


Tokenization
Frequency Distribution
Different Types of Tokenizers
Bigrams, Trigrams and N-grams
Stemming
Lemmatization
Stop Words
POS Tagging
Named Entity Recognition

Module 3 - Analyzing Sentence Structure


Syntax Trees
Chunking
Chinking
Context Free Grammars (CFG)
Automating Text Paraphrasing

Module 4 - Text Classification


Bag of Words
Count Vectorizer
Term Frequency (TF)
Inverse Document Frequency (IDF)
Converting text to features and labels
Multinomial Naive Bayes Classifier
Leveraging Confusion Matrix

Module 5 - Building a Resume Classifier


Use Case: Analysing and Classifying Resumes
Introduction to Tokenization
PhraseMatcher in spaCy
Demo: Analysing and Classifying Resumes Using spaCy

Module 6 - Building an Intent-Based RASA Chatbot (Self-paced)


Chatbot
Intent
Entity
Story
Action
Tips for Chatbot
Chatbot Tools
RASA - NLU
RASA - Core

Module 7 - NLP in Production


Sentiment Analysis
Important Sentiment Analysis Library
Tweepy
Textblob
Polarity
Sentiment
Deployment

Deep Learning
Course Curriculum

Module 1 - Introduction to Deep Learning


Why Deep Learning?
What is Deep Learning?
Curse of Dimensionality
Machine Learning vs Deep Learning
Use cases of Deep Learning
Human Brain vs Neural Network
What is Perceptron?
Learning Rate
Epoch
Batch Size
Activation Function
Single Layer Perceptron

Module 2 - Getting started with TensorFlow 2.x with TensorBoard


Limitations of Single Layer Perceptron
Importance of Multilayer Perceptron
Pre-activation function
Activation function
Step by step Multilayer perceptron implementation
Digit Classification using Simple Neural Network in TensorFlow 2.x
Improving the model
Adding Hidden Layer
Adding Dropout
Using Adam Optimizer
Introduction to TensorBoard

Module 3 - Neural Networks with TensorFlow 2.x


Need of Deep Learning
Neural Networks and their working
Artificial Neural Networks (ANN)
Types of Layers
Multilayer Perceptron (MLP)
Activation Functions
Forward Propagation
Back Propagation
Gradient Descent and Cost Function

Module 4 - Deep Learning for Images using CNN


Limitations of MLP
Introduction to Visual Cortex
What are CNNs?
CNN Layers and Components
Convolutional Layers
Pooling Layers
ReLU
Fully Connected Layers

Module 5 - TensorFlow Hub for Object Detection using Faster RCNN


Regional-CNN
Selective Search Algorithm
Bounding Box Regression
SVM in RCNN
Pre-trained Model
Model Accuracy
Model Inference Time
Model Size Comparison
Transfer Learning

Object Detection - Evaluation
mAP
IoU
RCNN - Speed Bottleneck
Fast R-CNN
RoI Pooling
Fast R-CNN - Speed Bottleneck
Faster R-CNN
Feature Pyramid Network (FPN)
Region Proposal Network (RPN)
Mask R-CNN
YOLO

Module 6 - Object Detection Using OpenCV - Part 1


What is Object Detection?
How does object detection work?
Types of Object detection algorithm
Two-shot detection
Single-shot detection
Haar Cascades
Installation of OpenCV library
Recognition or object detection in the image

Module 7 - Object Detection Using OpenCV - Part 2


YOLO object detection algorithm
Non-maximum Suppression
Implementation of YOLO
Custom Object detection with YOLO

Module 8 - Deep Learning for Sequences using RNN (Self-Paced)


Drawback of Feed Forward Network
What is RNN?
Architectures of RNN
Working of RNN
Back-propagation through time
Vanishing Gradient Problem
Exploding Gradient Problem
What is LSTM?
Gates used in LSTM
Working of LSTM

Data Warehousing and Big Data Storage
Course Curriculum

Module 1 - Data Warehousing


Why Data Warehousing?
Data Warehouse Defined
Dimensional Modelling
Data Warehouse Environment
Dimensions and Facts
Hierarchical Nature of Dimensions
OLAP vs OLTP
Data Mart
Schema Design
Star Schema
Snowflake Schema
Fact Constellations
Ralph Kimball Model
Designing a Data Warehouse

Module 2 - Data Integration & ETL


Need of an ETL Process
Extract, Transform and Load Process
Data Cube
Main Challenges of ETL
Incremental Load vs. Full Load
Best Practices of ETL Pipeline
Monitoring/Managing ETL Pipeline
Data Quality
Data Warehousing Middleware
Business Analysis: OLAP Features
OLAM: Data Warehousing to Mining

Module 3 - Getting Started with Big Data


Global Data Explosion
Structure of Data
What is Big Data?
5 V's of Big Data
What is Big Data Analytics?
Applications of Big Data
Problems with Storage & Processing of Big Data
Hadoop 1.x vs. Hadoop 2.x
Apache Hadoop
Hadoop 2.x Components
Hadoop YARN
YARN Components
Application submission in YARN
Hadoop Cluster Architecture
Hadoop Cluster Modes
Hadoop Ecosystem

Module 4 - Storing Big Data in a Distributed Cluster


Distributed File System (DFS)
Need of DFS
Characteristics of DFS
Data Storage in HDFS using blocks
Fault Tolerance
Replication Factor
Block Replication
Rack Awareness
Load Balancing
HDFS Balancer
Disk Balancer
HDFS Read/Write Mechanism
Hadoop Configuration Files
HDFS Commands

Advantages and Limitations of HDFS
MapReduce (MR)
Basic Terminology of MR
Logical Dataflow in MR
Map Task
Reduce Task
InputSplit
RecordReader
Job Submission Flow in MR

Module 5 - Data Mining (Self-Paced)


Need of Data Mining
What is Data Mining?
Application of Data Mining
Data Mining Process
What Kinds of Data are Mined?
What Kinds of Patterns are Mined?
Technologies in Data Mining
Core Ideas in Data Mining
Bias-Variance Trade-off
Data Exploration
Data Mining vs. Machine learning
Generative vs. Discriminative Methods
Classification Methods
K-Nearest Neighbours
Naïve Bayes
Rule Based Classification

Module 6 - Frequent Pattern Mining


What is Frequent Pattern Mining?
Affinity Analysis
Eclat Algorithm
Eclat vs. Apriori Algorithm
Cluster Analysis
K-Means Clustering
Hierarchical Clustering
Network Analysis
Centrality measures
Basic to Advanced Text Mining
Count/IDF/TF-IDF Model
Named Entity Recognition

Module 7 - Data Ingestion in Hadoop using Sqoop and Flume


Introduction to Data Ingestion
Hadoop Data Ingestion
Data Ingestion Tools in Hadoop
File Formats
Apache Sqoop
Features of Sqoop
Sqoop Architecture and Working
Sqoop Import
Sqoop Export
Sqoop Connectors
Sqoop Import Control Arguments
Sqoop Export Control Arguments
Sqoop Incremental Import
Sqoop Eval
Import All Tables using Sqoop
Sqoop Jobs
Apache Flume
Features of Flume
Flume Architecture
Events, Clients, & Agents
Interaction between Flume Components
Comparison between Sqoop and Flume

Module 8 - Spark’s Big Data Engine and RDD Concepts


In-memory Processing Systems
Comparison between MR and Spark
Apache Spark
Spark in Hadoop Ecosystem
Spark Components
Spark: Master/Slave Architecture
Different Job Running Modes of a Spark Application
Spark Shell using Python – PySpark
Resilient Distributed Datasets (RDDs)
Parallelized Collections and External Datasets
RDD Operations
Transformations
Actions
General RDD Functions
Key-Value Pair RDD Functions
RDD Partitions and Parallelism
RDD Caching and Persistence
Shared Variables in Spark
Broadcast Variables
Accumulators

Module 9 - Relational Data Processing with Spark: Spark SQL


Need of Spark SQL
Spark SQL Advantages over Hive
What is Spark SQL?
Features of Spark SQL
Spark SQL Architecture
Spark SQL Libraries: Data Source API
Spark SQL Libraries: DataFrame API
SparkContext and SparkSession
Creating DataFrames
DataFrame Operations
User Defined Functions (UDFs)
Interoperating with RDDs
Inferring the Schema using Reflection
Programmatically Specifying the Schema
Parquet Files
JSON Datasets
Introduction to Spark SQL Optimization
Catalyst Optimizer
Performance Tuning
In-memory Caching

Module 10 - Machine Learning with Spark: Spark ML


Machine Learning in Spark
Spark MLlib vs Spark ML
Spark ML Overview
Transformers
Estimators
Pipeline
Feature Extraction, Selection and Transformation
Spark ML Algorithms & Utilities
Regression
Classification
Clustering
Model Selection & Tuning

Data Visualization using Tableau
Course Curriculum

Module 1 - Data Connection and Visualization in Tableau


Data Visualization
Business Intelligence tools
VizQL Technology
Connect to data from File
Connect to data from Database or server
Basic Charts
Bar Chart
Line Chart
Pie Chart
Chart Operations
Hierarchies
Data Granularity
Highlighting
Sorting
Grouping
Filtering

Module 2 - Calculations in Tableau


Combining Data
Data Blending
Joins
Calculations
Built-in Functions
Table calculations
Parameters & Input Controls
Level Of Detail (LOD) Calculations
Include
Exclude
Fixed

Module 3 - Advanced Visualizations


Trend lines
Reference lines
Forecasting
Clustering
Geographic Maps
Types of Maps
Spatial Files
Web Map Services

Module 4 - Sharing Your Insights Through Dashboards


Using charts effectively
Dashboards
Dashboard Objects
Building a Dashboard
Dashboard Layouts and Formatting
Interactive Dashboards with Actions
Dashboards for Devices
Story Points
Visual Best Practices
Publish to Tableau Online

Capstone Project
Course Curriculum

Capstone Project

Heart Disease is among the most prevalent chronic diseases in the United
States, impacting millions of Americans each year and exerting a significant
financial burden on the economy. In the United States alone, heart disease
claims roughly 647,000 lives each year — making it the leading cause of
death. The buildup of plaques inside larger coronary arteries, molecular
changes associated with aging, chronic inflammation, high blood pressure,
and diabetes are all causes of and risk factors for heart disease.

The Centers for Disease Control and Prevention has identified high blood
pressure, high blood cholesterol, and smoking as three key risk factors for
heart disease. Roughly half of Americans have at least one of these three risk
factors. The National Heart, Lung, and Blood Institute highlights a wider array
of factors such as Age, Environment and Occupation, Family History and
Genetics, Lifestyle Habits, Other Medical Conditions, Race or Ethnicity, and
Sex for clinicians to use in diagnosing coronary heart disease. Diagnosis tends
to be driven by an initial survey of these common risk factors followed by
bloodwork and other tests.

SQL Essentials for Data Science and AI (Self-paced)
Course Curriculum

Module 1 - Introduction to RDBMS and MySQL


MySQL products
Basics of Relational Databases
MySQL client-server model
MySQL Workbench

Module 2 - Database Modeling


Database Modeling
Normalizing the Data Model
Creating a Database using MySQL
Evaluating the Database Design

Module 3 - Creating Databases and Tables


Creating a Database
Creating a Table
Showing How a Table Was Created
Table Options
Column Options
Indexes, Keys, and Constraints
Deleting database and tables
Creating New Table Using an Existing Table
Creating a Temporary Table
Copying an Existing Table Structure
Adding, removing and modifying table columns and indexes

Module 4 - Querying and Modifying Tables


The SELECT Statement
Creating Views
Querying Data from an Application
Exporting and Importing a Delimited File
Troubleshooting: Authorization Levels
Manipulating Data and inserting records
Replacing and updating existing records
Deleting records
Exporting and importing a script
Multistatement Transactions
Modifying Data from an Application

Module 5 - Joins and Functions in MySQL


Querying Multiple Tables
Joining Tables with SELECT
Inner Joins
Outer Joins
Table Name Aliases
Functions in MySQL Expressions
Using Functions
String Functions
Date and time functions
Numeric Functions
Aggregate Functions
Spaces in Function Names

Module 6 - Database Integration with Python


Basics of DBMS
Integrating Databases with Python



Mastering ChatGPT (Self-paced)
Course Curriculum

Module 1 - Introduction to ChatGPT and OpenAI


Scenario: How can Generative AI be Leveraged?
Emergence of ChatGPT
Introduction to Chatbots
ChatGPT: The AI Phenomenon
Unlocking the Mystery Behind ChatGPT
Introduction to OpenAI
Overview of OpenAI's GPT models
Environment setup
OpenAI API
Log-in Process: OpenAI API

Module 2 - Business Use Cases of ChatGPT


Using ChatGPT for live coding
Build, Optimize, and Scale businesses using ChatGPT
Advanced SEO for digital marketers
Creating Social Media posts with ChatGPT
Using ChatGPT for Language Translation
Code Generation and Code Debugging with ChatGPT
Content Creation with ChatGPT
Customer Support with ChatGPT
Email Marketing with ChatGPT
Sentiment Analysis using ChatGPT

Module 3 - Deploying & Integrating ChatGPT in Business Applications


Integrate ChatGPT with Power Automate
Integrate ChatGPT with Power Apps
Create serverless ChatGPT
Deployment on cloud platforms


Module 4 - GPT Models, Pre-processing and Fine-tuning ChatGPT


Overview of Language Models
Understanding the Architecture of the GPT Model
GPT Models: Advantages and Disadvantages
Training of ChatGPT
Overview of the Pre-trained GPT Models Available for Fine-tuning
Fine-tuning the GPT-3 Model
Data Preparation
Hyperparameter Tuning

Module 5 - Working with GPT-3 and OpenAI API


Introduction to GPT-3
Democratizing NLP
Understanding prompts, completions, and tokens
Introducing the Playground
Understanding general GPT-3 use cases
Understanding semantic search
Understanding GPT-3 risks
Understanding APIs
Getting familiar with HTTP
Reviewing the OpenAI API endpoints
Introducing CURL and Postman
Understanding API authentication
Making an authenticated request to the OpenAI API
Introducing JSON
Using the Completions endpoint
Using the Semantic Search endpoint



Sequence Learning (Self-paced)
Course Curriculum

Module 1 - Introduction to Sequence Learning


What is Sequence Learning?
Application of Sequence Learning
What is a Sequence Model?
Bayesian network
Markov Model
What is a Markov Model?
Markov chain
Markov chain: Use Case
Hidden Markov Model
Viterbi Algorithm

Module 2 - Google Price Prediction using RNN and GRU


Issues with Feed Forward Network
Recurrent Neural Network (RNN)
Architecture of RNN
Calculation in RNN
Backpropagation and Loss Calculation
Applications of RNN
Vanishing Gradient
Exploding Gradient
What is GRU?
Components of GRU
Update gate
Reset gate
Current memory content
Final memory at current time step


Module 3 - Sentiment Analysis on Zomato Reviews using LSTM


What is LSTM?
Structure of LSTM
Forget Gate
Input Gate
Output Gate
LSTM architecture
Types of Sequence Based Model
Sequence Prediction
Sequence Classification
Sequence Generation
How to increase the efficiency of the model?
Backpropagation through time
Workflow of BPTT
Types of LSTM
Vanilla LSTM
Stacked LSTM
CNN LSTM
Bidirectional LSTM

Module 4 - Introduction to Transformer Model


What is a Transformer?
Why use a Transformer?
Encoder - Decoder
Hugging Face Transformer
Creating a Pipeline
Sentiment Analysis
Question Answer
Text Classification
Language Translator
Mask Filling


Module 5 - BERT and GPT2 using Transformer


Transformer vs GPT vs BERT
BERT Model
BERT Architecture
Pretrained Model
Fine Tuned Model
Single Sentence Classification
Question-Answer Pair Task

Module 6 - Machine Translation with MT5


Demo

Module 7 - Building a Question Answer prediction model using BERT


Demo



Power BI Desktop and Data
Virtualization Course Curriculum

Module 1 - Power BI Desktop and Data Transformation


Data Sources in Power BI Desktop
Loading Data into Power BI Desktop
Views in Power BI Desktop
Query Editor in Power BI
Transform, Clean, Shape, and Model Data
Managing Data Relationships
Editing a Relationship
Cross Filter Direction
Saving Workfile
Measures

Module 2 - Data Analysis Expressions (DAX)


Introduction to DAX
Importance of DAX
Data Types in DAX
DAX Operators
DAX Calculation Types
Steps to Create Calculated Columns
Steps to Create Calculated Tables
Measures in DAX
DAX Syntax
DAX Functions
DAX Tables and Filtering


Module 3 - Data Visualization and Power BI Service


Introduction to Visuals in Power BI
Visualization Charts in Power BI
Slicers and Map Visualizations
Matrixes and Tables
Gauges and Single Number Cards
Modifying Colors in Charts and Visuals
Shapes, Text Boxes, and Images
Custom Visuals
Page Layout and Formatting
KPI Visuals and Z-Order
Bookmarks and Selection Pane
Introduction to Power BI Service
Creating a Dashboard
Quick Insights in Power BI
Configuring a Dashboard
Power BI Q&A
Power BI Embedded
Bookmarks and buttons



ADVANCED CERTIFICATE PROGRAM IN

DATA SCIENCE AND AI

www.edureka.co
