Unit.1.Introduction To Deep Learning

Uploaded by

studymust00


These notes cover the topics of an introductory unit in a final-year deep learning course.
Each topic includes definitions, types, architectures, numerical examples, and real-life
applications, with a summary table where helpful.

History of Deep Learning

Deep learning, a subset of machine learning, is inspired by the structure and function of the
human brain. While its recent surge in popularity is undeniable, its roots stretch back several
decades.

• Early Beginnings (1940s-1960s): The concept of artificial neurons emerged in the 1940s
with McCulloch and Pitts' model (1943), a simplified mathematical model of a
biological neuron. This laid the theoretical groundwork. Frank Rosenblatt's Perceptron
(1958) was one of the first neural networks, capable of learning to classify patterns.
• AI Winter (1970s-early 1980s): Limitations of single-layer perceptrons (e.g., inability to
solve the XOR problem, demonstrated by Minsky and Papert in 1969) led to a decline in
neural network research, often referred to as the "AI Winter."
• Resurgence (1980s-1990s): The development of backpropagation (popularized by
Rumelhart, Hinton, and Williams in 1986) provided an efficient way to train multi-layer
neural networks, reigniting interest. Convolutional Neural Networks (CNNs) were
introduced by Yann LeCun and colleagues in the late 1980s for handwritten digit recognition.
• The Deep Learning Revolution (2000s-Present): Several factors contributed to the
explosive growth of deep learning:
o Availability of Large Datasets: The internet and digital technologies led to an
abundance of data.
o Increased Computational Power: The advent of powerful GPUs made it
feasible to train complex deep models.
o Algorithmic Improvements: Advances like rectified linear units (ReLUs), which
mitigate vanishing gradients, dropout, which reduces overfitting, and improved
optimization techniques made deep networks easier to train.
o Key Milestones: AlexNet's victory in the ImageNet Large Scale Visual
Recognition Challenge (ILSVRC) in 2012, significantly outperforming traditional
methods, is often considered a pivotal moment.

Deep Learning Success Stories

Deep learning has transformed various industries and applications, demonstrating remarkable
performance in tasks previously considered intractable.

• Image Recognition and Computer Vision:
o Facial Recognition: Used in security, social media tagging, and smartphone
unlocking.
o Object Detection: Self-driving cars (identifying pedestrians, vehicles, traffic
signs), surveillance, industrial automation.
o Medical Imaging Analysis: Detecting diseases like cancer, diabetic retinopathy,
and pneumonia from X-rays, MRIs, and CT scans.
• Natural Language Processing (NLP):
o Machine Translation: Google Translate, enabling communication across
language barriers.
o Speech Recognition: Virtual assistants (Siri, Alexa, Google Assistant),
transcribing audio.
o Sentiment Analysis: Understanding opinions and emotions from text data.
o Text Generation: Generating human-like text for chatbots, content creation.
• Recommender Systems:
o Personalized Recommendations: Netflix (movies/shows), Amazon (products),
Spotify (music), tailoring suggestions based on user preferences.
• Generative AI:
o Image Generation: Creating realistic images from text descriptions (DALL-E,
Midjourney).
o Code Generation: Assisting developers by generating code snippets.
o Drug Discovery: Accelerating the design and discovery of new drugs and
materials.
• Gaming and Robotics:
o AlphaGo: DeepMind's AlphaGo defeated world champion Go player Lee Sedol in
2016, a significant AI breakthrough.
o Robotics: Enabling robots to perform complex tasks, navigate environments, and
interact with humans.

Three Classes of Deep Learning Architectures

Deep learning models are broadly categorized into three main classes based on their architecture
and suitability for different types of data and tasks.

1. Feedforward Neural Networks (FNNs) / Multilayer Perceptrons (MLPs):

o Definition: These are the simplest type of artificial neural networks where
connections between nodes do not form a cycle. Information flows in one
direction, from the input layer, through hidden layers, to the output layer.
o Use Cases: Tabular data classification and regression, image classification (for
simpler cases), learning complex non-linear relationships.
o Example: Predicting house prices based on features like area, number of
bedrooms, location.
2. Convolutional Neural Networks (CNNs):
o Definition: Specifically designed to process data that has a known grid-like
topology, such as image data (2D grid of pixels) or time-series data (1D grid).
They use convolutional layers to automatically and adaptively learn spatial
hierarchies of features.
o Use Cases: Image recognition, object detection, video analysis, image generation.
o Example: Classifying images of cats vs. dogs.
3. Recurrent Neural Networks (RNNs):
o Definition: Designed to handle sequential data, where the order of information
matters. Unlike FNNs, RNNs have loops, allowing information to persist from
one step of the sequence to the next, giving them a form of "memory."
o Use Cases: Natural language processing (machine translation, speech
recognition), time series prediction, video analysis (actions over time).
o Example: Predicting the next word in a sentence.

Class of DL             Data Type                Key Feature              Common Applications
----------------------  -----------------------  -----------------------  ------------------------------------------------------
Feedforward NN / MLP    Tabular, Simple Image    No loops, one-way flow   Classification, Regression, Simple Pattern Recognition
Convolutional NN        Image, Video, Grid       Convolutional layers     Image Recognition, Object Detection
Recurrent NN            Sequential, Time-Series  Loops, memory            NLP, Speech Recognition, Time Series Forecasting

Basic Terminologies of Deep Learning

Understanding these fundamental terms is crucial for comprehending deep learning concepts.

• Neuron (Node): The basic unit of a neural network, inspired by biological neurons. It
receives inputs, performs a weighted sum, applies an activation function, and produces an
output.
• Activation Function: A non-linear function applied to a neuron's weighted sum of inputs. It
introduces non-linearity, enabling the network to learn complex patterns and approximate
any continuous function. Common examples include Sigmoid, ReLU, Tanh.
• Weights: Parameters within a neural network that determine the strength of the
connection between two neurons. They are adjusted during training to minimize the error.
• Bias: An additional parameter in a neuron that allows the activation function to be
shifted. It helps the model fit the data better.
• Input Layer: The first layer of a neural network that receives the raw input data.
• Hidden Layer: Layers between the input and output layers where the network performs
computations and learns features from the data. Deep learning refers to networks with
multiple hidden layers.
• Output Layer: The final layer of the network that produces the prediction or output.
• Loss Function (Cost Function): A function that quantifies the error between the
predicted output of the network and the actual target output. The goal of training is to
minimize this loss. Examples: Mean Squared Error (MSE), Cross-Entropy.
• Optimizer: An algorithm or method used to adjust the weights and biases of the network
to minimize the loss function. Examples: Gradient Descent, Adam, RMSprop.
• Epoch: One complete pass through the entire training dataset during the training process.
• Batch Size: The number of training examples used in one iteration of the optimizer.
• Learning Rate: A hyperparameter that controls how much the weights are adjusted with
respect to the loss gradient during optimization.
• Backpropagation: An algorithm used to efficiently calculate the gradients of the loss
function with respect to the weights and biases in a neural network, enabling their update.
• Hyperparameters: Parameters whose values are set before the training process begins
(e.g., learning rate, number of hidden layers, number of neurons per layer).
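
To make a few of these terms concrete, the following minimal Python sketch computes the two loss functions named above on a tiny set of made-up predictions and shows how dataset size, batch size, and iterations per epoch relate. The data values and the function names (mse, binary_cross_entropy, iterations_per_epoch) are illustrative assumptions, not part of the notes.

```python
import math

def mse(targets, predictions):
    """Mean Squared Error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def binary_cross_entropy(targets, probabilities):
    """Average negative log-likelihood for 0/1 targets."""
    total = 0.0
    for t, p in zip(targets, probabilities):
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)

def iterations_per_epoch(num_examples, batch_size):
    """Number of optimizer updates needed for one full pass over the data."""
    return math.ceil(num_examples / batch_size)

# Hypothetical targets and model outputs, chosen only for illustration.
targets = [1, 0, 1, 1]
predictions = [0.9, 0.2, 0.7, 0.6]

print(f"MSE: {mse(targets, predictions):.4f}")
print(f"Cross-entropy: {binary_cross_entropy(targets, predictions):.4f}")
print("Iterations per epoch:", iterations_per_epoch(1000, 32))
```

With 1000 examples and a batch size of 32, one epoch takes 32 iterations; the last batch is smaller than the others.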

Feedforward Neural Network (FNN)

Definition

A Feedforward Neural Network (FNN), also known as a Multilayer Perceptron (MLP), is
the simplest and most fundamental type of artificial neural network. In an FNN, information
flows in only one direction: from the input layer, through one or more hidden layers, to the
output layer. There are no loops or cycles in the network, meaning the output of a neuron does
not affect its own input.

Architecture

An FNN consists of:

• Input Layer: This layer receives the raw input data. Each node in the input layer
corresponds to a feature in the input data.
• Hidden Layers: These are intermediate layers between the input and output layers.
FNNs can have one or more hidden layers. Each neuron in a hidden layer receives inputs
from all neurons in the previous layer, performs a weighted sum, and applies an
activation function. The "deep" in deep learning refers to networks with many hidden
layers.
• Output Layer: This is the final layer that produces the network's prediction. The number
of neurons in the output layer depends on the type of problem (e.g., one neuron for binary
classification/regression, multiple neurons for multi-class classification).
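
Because every neuron receives inputs from all neurons in the previous layer, the number of trainable parameters follows directly from the layer sizes. The sketch below counts them; the function name count_parameters and the example layer sizes are assumptions made for illustration.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected feedforward network.

    layer_sizes lists the number of neurons per layer, input layer first.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # one weight per connection between the layers
        total += fan_out           # one bias per neuron in the receiving layer
    return total

# Hypothetical network: 4 input features, two hidden layers of 8 neurons, 1 output.
print(count_parameters([4, 8, 8, 1]))  # prints 121
```

Here the count is (4·8 + 8) + (8·8 + 8) + (8·1 + 1) = 40 + 72 + 9 = 121.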

How it works:

1. Input: The input features $(x_1, x_2, \dots, x_n)$ are fed into the input layer.
2. Weighted Sum: Each neuron in a subsequent layer calculates a weighted sum of its
inputs from the previous layer, adding a bias term. For a neuron j in layer L, receiving
inputs from layer L−1:

$$z_j^{L} = \sum_i \left( w_{ij}^{L} \cdot a_i^{L-1} \right) + b_j^{L}$$

where $w_{ij}^{L}$ are the weights connecting neuron i in layer L−1 to neuron j in layer L,
$a_i^{L-1}$ are the activations (outputs) of neurons in the previous layer, and $b_j^{L}$ is
the bias for neuron j.
3. Activation: The weighted sum ($z_j^{L}$) is then passed through a non-linear activation
function (f).

$$a_j^{L} = f(z_j^{L})$$

This activation $a_j^{L}$ becomes the input to the neurons in the next layer.

4. Output: This process continues until the output layer, which produces the final
prediction.
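
The four steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the 2-2-1 network, its weight values, and the helper names (dense_layer, forward) are assumptions, and sigmoid is used as the activation f in every layer. The first hidden neuron uses the weights 0.2 and 0.6 and the bias 0.1, so for the inputs 0.5 and 0.8 its weighted sum is 0.68.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(activations, weights, biases, activation_fn):
    """Compute one layer: a_j = f(sum_i weights[i][j] * a_i + b_j)."""
    outputs = []
    for j in range(len(biases)):
        # Step 2: weighted sum of the previous layer's activations plus bias.
        z_j = sum(weights[i][j] * activations[i] for i in range(len(activations)))
        z_j += biases[j]
        # Step 3: non-linear activation.
        outputs.append(activation_fn(z_j))
    return outputs

def forward(x, layers):
    """Propagate the input x through a list of (weights, biases) pairs."""
    a = x  # Step 1: the input features are the first "activations".
    for weights, biases in layers:
        a = dense_layer(a, weights, biases, sigmoid)
    return a  # Step 4: activations of the output layer.

# Hypothetical 2-2-1 network; weights[i][j] connects neuron i to neuron j.
layers = [
    ([[0.2, -0.4], [0.6, 0.1]], [0.1, 0.0]),  # input -> hidden
    ([[0.5], [-0.3]], [0.2]),                 # hidden -> output
]

hidden = dense_layer([0.5, 0.8], layers[0][0], layers[0][1], sigmoid)
output = forward([0.5, 0.8], layers)
print("hidden activations:", [round(h, 4) for h in hidden])
print("network output:", [round(o, 4) for o in output])
```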

Numerical Example (Single Neuron)

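Let's consider a single neuron (part of an FNN) with two inputs, $x_1$ and $x_2$.

• Inputs: $x_1 = 0.5$, $x_2 = 0.8$
• Weights: $w_1 = 0.2$, $w_2 = 0.6$
• Bias: $b = 0.1$
• Activation Function: Sigmoid, $f(z) = \frac{1}{1 + e^{-z}}$

Calculation:

1. Weighted Sum (z):

$$z = (x_1 \cdot w_1) + (x_2 \cdot w_2) + b$$

$$z = (0.5 \cdot 0.2) + (0.8 \cdot 0.6) + 0.1$$

$$z = 0.1 + 0.48 + 0.1$$

$$z = 0.68$$

2. Activation (Output a):

$$a = f(z) = \frac{1}{1 + e^{-0.68}}$$

$$e^{-0.68} \approx 0.5066$$

$$a = \frac{1}{1 + 0.5066} = \frac{1}{1.5066} \approx 0.6637$$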

So, the output of this neuron would be approximately 0.6637.
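
The calculation above can be checked with a few lines of Python. The function name neuron_output is an assumption for illustration; the inputs, weights, and bias are the ones used in the example.

```python
import math

def neuron_output(inputs, weights, bias):
    """Return the weighted sum z and the sigmoid activation a of one neuron."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    a = 1.0 / (1.0 + math.exp(-z))
    return z, a

z, a = neuron_output([0.5, 0.8], [0.2, 0.6], 0.1)
print(f"z = {z:.2f}, a = {a:.4f}")  # prints: z = 0.68, a = 0.6637
```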

Real-life Example

• Credit Scoring: Banks use FNNs to assess the creditworthiness of loan applicants.
Inputs could include income, debt-to-income ratio, credit history, and employment status.
The FNN learns to identify patterns indicative of default risk and outputs a credit score or
a probability of default.
• Spam Detection: Classifying emails as spam or not spam. Inputs could be features
extracted from the email text (e.g., frequency of certain words, sender information,
presence of suspicious links).

Representation Power of Feedforward Neural Network (and Multilayer Perceptron)

Universal Approximation Theorem

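The Universal Approximation Theorem is a fundamental result illustrating the power of
FNNs (specifically MLPs). It states that a feedforward network with a single hidden layer
containing a finite number of neurons (using a non-linear activation function like sigmoid or
ReLU) can approximate any continuous function on a compact (closed and bounded) input
domain to an arbitrary degree of accuracy, given enough neurons. The theorem only guarantees
that such a network exists; it does not say how many neurons are needed or that training will
find the right weights.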

• Implication: This means that theoretically, a sufficiently large FNN can learn extremely
complex relationships between inputs and outputs, effectively acting as a universal
function approximator.
• Why hidden layers are crucial: Without hidden layers or with only linear activation
functions, an FNN can only learn linear relationships, severely limiting its capabilities.
Non-linearity introduced by activation functions in hidden layers allows the network to
capture intricate, non-linear patterns in data.
• Depth vs. Breadth: While one hidden layer is theoretically sufficient, in practice, deeper
networks (with multiple hidden layers) often perform better. Deeper networks can learn
hierarchical representations of data, where earlier layers learn simpler features, and later
layers combine these simpler features into more abstract and complex ones. This often
leads to better generalization and efficiency.
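
A small illustration of why a hidden layer with a non-linear activation matters: XOR, which a single-layer perceptron cannot represent, is computed exactly by a network with two hidden ReLU neurons. The weights below are hand-picked for illustration (they are an assumption, not learned by training), and the function name xor_network is hypothetical.

```python
def relu(z):
    return max(0.0, z)

def xor_network(x1, x2):
    """A 2-2-1 network with hand-picked weights that computes XOR on 0/1 inputs."""
    h1 = relu(x1 + x2)        # hidden neuron 1: weights (1, 1), bias 0
    h2 = relu(x1 + x2 - 1.0)  # hidden neuron 2: weights (1, 1), bias -1
    return h1 - 2.0 * h2      # linear output neuron: weights (1, -2), bias 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_network(x1, x2))
```

If relu were replaced by the identity function, the output would collapse to a linear function of x1 and x2, which cannot produce the XOR pattern.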

Real-life Examples of Representation Power

• Image Feature Extraction: An MLP could learn to represent raw pixel values as more
abstract features like edges, corners, and textures in early hidden layers, and then
combine these into object parts in deeper layers, eventually recognizing full objects.
• Language Understanding: In NLP, MLPs can learn to represent words as numerical
vectors (word embeddings) and then combine these representations to understand the
meaning of phrases, sentences, or even entire documents, capturing semantic
relationships.

Sigmoid Neurons

Definition

A Sigmoid Neuron is a type of artificial neuron that uses the sigmoid function (also known as
the logistic function) as its activation function. The sigmoid function squashes any input value
into a range between 0 and 1.

Architecture (within a neuron)

In a sigmoid neuron, the output a is calculated as:

$$a = \sigma(z) = \sigma\left( \sum_i w_i x_i + b \right)$$

where:

• $x_i$ are the inputs
• $w_i$ are the weights
• $b$ is the bias
• $z$ is the weighted sum of inputs plus bias
• $\sigma(z) = \frac{1}{1 + e^{-z}}$ is the sigmoid activation function.

Properties and Importance:

• Non-linearity: The sigmoid function introduces non-linearity, which is essential for
neural networks to learn complex, non-linear relationships in data. Without non-linear
activation functions, a multi-layer neural network would simply be equivalent to a
single-layer linear model.
• Smooth Gradient: The sigmoid function is differentiable everywhere, which is crucial
for gradient-based optimization algorithms like gradient descent and backpropagation. Its
derivative is $\sigma'(z) = \sigma(z)(1 - \sigma(z))$.
• Output Range: The output of a sigmoid neuron is always between 0 and 1. This makes it
particularly useful in the output layer for:
o Binary Classification: The output can be interpreted as a probability (e.g.,
probability of belonging to class 1).
o Normalization: Scaling values to a common range.
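
The derivative identity σ′(z) = σ(z)(1 − σ(z)) can be sanity-checked numerically. The sketch below compares it with a central finite difference at a few arbitrarily chosen points; the helper names and the step size h are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    """Analytic derivative: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def numerical_derivative(f, z, h=1e-6):
    """Central finite-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

for z in (-2.0, 0.0, 0.68, 3.0):
    analytic = sigmoid_derivative(z)
    numeric = numerical_derivative(sigmoid, z)
    print(f"z={z:5.2f}  sigma={sigmoid(z):.4f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

The two derivative columns should agree to about six decimal places, and every sigmoid value lies strictly between 0 and 1.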

Numerical Example (Building on FNN Example)

Using the previous single neuron example, with z=0.68:

1. Sigmoid Activation:

$$a = \sigma(0.68) = \frac{1}{1 + e^{-0.68}}$$

$$a \approx 0.6637$$

This output, being between 0 and 1, can be interpreted as a probability if this neuron is in the
output layer of a binary classification model. For example, if we're classifying whether an email
is spam, an output of 0.6637 might mean there's a 66.37% probability it's spam.

Disadvantages:

• Vanishing Gradients: For very large positive or negative values of z, the derivative of
the sigmoid function becomes very small (approaching zero), and even at its maximum
(z = 0) it is only 0.25. This can lead to the "vanishing gradient problem" during
backpropagation, where the gradients become so small that the weights in earlier layers
are updated very slowly, hindering learning.
• Not Zero-Centered Output: The output of the sigmoid function is always positive. When
all inputs to a neuron are positive, the gradients of the loss with respect to that neuron's
incoming weights all share the same sign (all positive or all negative), which can lead to
zig-zagging optimization paths.
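
The vanishing-gradient effect can be seen numerically. The sketch below prints the sigmoid derivative at a few values of z and then an upper bound on the product of sigmoid derivatives across several layers. It is a deliberate simplification: real backpropagated gradients also include the weight terms, which are ignored here, and the chosen z values and layer counts are arbitrary.

```python
import math

def sigmoid_derivative(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

# The derivative peaks at 0.25 (z = 0) and shrinks quickly as |z| grows.
for z in (0.0, 2.0, 5.0, 10.0):
    print(f"z = {z:4.1f}  sigma'(z) = {sigmoid_derivative(z):.6f}")

# Backpropagation multiplies one such factor per layer, so the product of
# sigmoid derivatives across n layers can be at most 0.25 ** n.
for n_layers in (1, 5, 10, 20):
    print(f"{n_layers:2d} layers: at most {0.25 ** n_layers:.3e}")
```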

Due to these disadvantages, ReLU (Rectified Linear Unit) and its variants have largely
replaced sigmoid as the activation function of choice for hidden layers in deep networks, though
sigmoid remains popular in output layers for binary classification.

Gradient Descent

Definition

Gradient Descent is an iterative optimization algorithm used to find the minimum of a function.
In the context of deep learning, it's used to minimize the loss function (or cost function) of a
neural network by iteratively adjusting the network's parameters (weights and biases) in the
direction opposite to the gradient of the loss function.

The "gradient" is a vector that points in the direction of the steepest ascent of the function.
Therefore, moving in the opposite direction (down the slope) leads to the minimum.

Types of Gradient Descent

There are primarily three variants of gradient descent, differing in how much data they use to
compute the gradient at each step:

1. Batch Gradient Descent (BGD):

o Definition: Computes the gradient of the loss function with respect to the
parameters for the entire training dataset in each iteration.
o Pros: With a suitably small learning rate, converges to the global minimum for convex
functions and to a stationary point (typically a local minimum) for non-convex functions.
Provides a stable gradient.
o Cons: Computationally expensive for large datasets, as it requires processing all
training examples for each parameter update. Can be slow.
2. Stochastic Gradient Descent (SGD):
o Definition: Computes the gradient and updates parameters using one individual
training example (a batch of size 1) at a time.
o Pros: Much faster than BGD, especially for large datasets. Can escape shallow
local minima in non-convex loss landscapes due to its noisy updates.
o Cons: Updates are noisy, leading to oscillations around the minimum. Requires
careful tuning of the learning rate.
3. Mini-Batch Gradient Descent (MBGD):
o Definition: The most common and practical approach. It computes the gradient
and updates parameters using a small batch of training examples (e.g., 32, 64,
128) in each iteration.
o Pros: Balances the advantages of BGD (stable updates, efficient matrix
operations) and SGD (faster convergence, less prone to getting stuck in local
minima).
o Cons: Requires tuning of the batch size.
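
The three variants differ only in how many examples feed each parameter update. The sketch below splits a toy dataset into batches and counts the updates per epoch for each choice; the helper name make_batches, the fixed seed, and the 10-element stand-in dataset are assumptions for illustration.

```python
import random

def make_batches(data, batch_size, shuffle=True, seed=0):
    """Split data into consecutive batches after an optional shuffle."""
    indices = list(range(len(data)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    return [
        [data[i] for i in indices[start:start + batch_size]]
        for start in range(0, len(indices), batch_size)
    ]

data = list(range(10))  # stand-in for 10 training examples

batch_gd = make_batches(data, batch_size=len(data))  # whole dataset per update
sgd = make_batches(data, batch_size=1)               # one example per update
mini_batch = make_batches(data, batch_size=4)        # small batch per update

print("Batch GD updates per epoch:  ", len(batch_gd))
print("SGD updates per epoch:       ", len(sgd))
print("Mini-batch updates per epoch:", len(mini_batch))
```

With 10 examples this gives 1, 10, and 3 updates per epoch respectively; the last mini-batch holds only 2 examples.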

Algorithm (High-Level)

The general steps for gradient descent are:

1. Initialize Parameters: Start with random initial weights and biases for the neural
network.
2. Compute Prediction: For a given input, calculate the network's output.
3. Calculate Loss: Compute the error between the predicted output and the actual target
using the loss function.
4. Compute Gradients: Calculate the gradients of the loss function with respect to each
weight and bias in the network (using backpropagation).
5. Update Parameters: Adjust the weights and biases using the following rule:

$$\theta_{\text{new}} = \theta_{\text{old}} - \text{learning\_rate} \cdot \nabla J(\theta_{\text{old}})$$

where:

o $\theta$ represents a parameter (weight or bias)
o learning_rate is a hyperparameter that controls the step size
o $\nabla J(\theta)$ is the gradient of the loss function J with respect to $\theta$.
6. Repeat: Repeat steps 2-5 for a fixed number of epochs or until the loss converges.
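
The six steps can be sketched for the smallest possible model, a single linear neuron y = w·x + b trained with mean squared error on the full batch. Several choices here are assumptions made to keep the sketch short and reproducible: the parameters start at zero instead of random values, the gradients are written out by hand rather than obtained through backpropagation, and the data, learning rate, and epoch count are made up.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # Step 1: initialize parameters (zeros for reproducibility)
    n = len(xs)
    loss = None
    for _ in range(epochs):  # Step 6: repeat for a fixed number of epochs
        predictions = [w * x + b for x in xs]               # Step 2: predict
        errors = [p - y for p, y in zip(predictions, ys)]
        loss = sum(e * e for e in errors) / n               # Step 3: loss (MSE)
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))  # Step 4
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w                         # Step 5: update
        b -= learning_rate * grad_b
    return w, b, loss

# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b, loss = train_linear_model(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}, final loss = {loss:.2e}")
```

The learned parameters should come out very close to w = 2 and b = 1, the values used to generate the data.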

Numerical Example (Minimizing a Quadratic Function)

Let's consider a very simple example: finding the minimum of a quadratic function $f(x) = x^2$.
We want to find the value of x that minimizes f(x).

• Function: $f(x) = x^2$
• Derivative (Gradient): $f'(x) = 2x$
• Initial Value: $x = 5$
• Learning Rate ($\alpha$): 0.1

Iteration 1:

1. Current $x = 5$
2. Gradient at $x = 5$: $f'(5) = 2 \cdot 5 = 10$
3. Update x: $x_{\text{new}} = x_{\text{old}} - \alpha \cdot f'(x_{\text{old}})$, so
$x_{\text{new}} = 5 - 0.1 \cdot 10 = 5 - 1 = 4$

Iteration 2:

1. Current $x = 4$
2. Gradient at $x = 4$: $f'(4) = 2 \cdot 4 = 8$
3. Update x: $x_{\text{new}} = 4 - 0.1 \cdot 8 = 4 - 0.8 = 3.2$

Iteration 3:

1. Current $x = 3.2$
2. Gradient at $x = 3.2$: $f'(3.2) = 2 \cdot 3.2 = 6.4$
3. Update x: $x_{\text{new}} = 3.2 - 0.1 \cdot 6.4 = 3.2 - 0.64 = 2.56$

As you can see, x is progressively moving closer to the minimizer $x = 0$, where f attains its
minimum value of 0. Each update multiplies x by $1 - 2\alpha = 0.8$, so the steps become
smaller as x approaches 0 because the gradient also approaches 0.
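
The three iterations above can be reproduced with a short loop. The function name gradient_descent and the history list are assumptions for illustration; the function, starting point, and learning rate are the ones from the example.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Run plain gradient descent and return every visited value of x."""
    x = x0
    history = [x]
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # x_new = x_old - alpha * f'(x_old)
        history.append(x)
    return history

# f(x) = x^2 has gradient f'(x) = 2x.
history = gradient_descent(lambda x: 2 * x, x0=5.0, learning_rate=0.1, steps=3)
print([round(x, 4) for x in history])  # prints [5.0, 4.0, 3.2, 2.56]
```

Raising steps to 50 brings x below 0.0001, since each step scales x by 0.8.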

Real-life Example

• Training any Deep Learning Model: Virtually every deep learning model (CNNs, RNNs,
FNNs) is trained with some form of gradient descent (typically mini-batch gradient descent
with advanced optimizers like Adam or RMSprop) to learn the weights and biases that
minimize the error on the training data. This process allows the model to learn to recognize
patterns, make predictions, or generate content.
