Neural Network
The Idea of ANNs
▪ ANNs learn relationships between cause and effect, or organize large volumes of data
into orderly and informative patterns.
[Figure: image-classification example – shown a picture and asked "What is that?", the network picks among candidate labels (frog, lion, bird) and answers "It's a frog".]
Artificial Neural Network
▪ ANN Definition:
“Data processing system consisting of a large number of simple, highly interconnected
processing elements (artificial neurons) in an architecture inspired by the structure of the
cerebral cortex of the brain”
▪ Synaptic connection strengths among neurons are used to store the acquired
knowledge.
[Figure: a single-input artificial neuron – input x is multiplied by weight w, the bias b is added, and wx + b passes through an activation function to give output y.]
A computer neuron mimics the properties of biological neurons:
[Figure: a multi-input neuron with inputs x_1 … x_n and weights w_1 … w_n.]
y = f\left(\sum_{i=1}^{n} x_i w_i + b\right)
Perceptron
[Figure: perceptron with inputs x_1 … x_n, weights w_1 … w_n, bias b, and output Y.]
▪ Perceptron
  ▪ computes the weighted sum of its inputs (called its net input)
  ▪ adds its bias
  ▪ passes this value through an activation function
▪ We say that the neuron "fires" (i.e. becomes active) if its output is above zero.
▪ Bias can be incorporated as another weight clamped to a fixed input of +1.0.
▪ This extra free variable (bias) makes the neuron more powerful.
net = \sum_{i=1}^{n} w_i x_i + b
Y = \begin{cases} 1 & \text{if } net > 0 \\ 0 & \text{otherwise} \end{cases}
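A minimal sketch of this perceptron in Python/NumPy (the example weights are hand-picked and the function name is mine, not from the slides):

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of the inputs plus bias, passed through a step activation."""
    net = np.dot(w, x) + b        # net = sum_i w_i x_i + b
    return 1 if net > 0 else 0    # the neuron "fires" if its net input is above zero

# Example: two inputs with hand-picked weights
print(perceptron(np.array([1, 0]), np.array([1.0, 1.0]), -0.5))   # -> 1
```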
Perceptron
[Figure: the same perceptron, but with the bias incorporated as an extra weight w_0 clamped to a fixed input x_0 = +1.]
▪ The bullets above apply unchanged; the bias is simply folded into the weight vector:
net = \sum_{i=0}^{n} w_i x_i, \quad \text{with } x_0 = +1 \text{ (so } w_0 \text{ plays the role of the bias } b\text{)}
Inverter
input | output
x1    | Y
0     | 1
1     | 0
[Figure: single-input perceptron computing NOT – weight w_0 = −1 on input x_0, bias 0.5, output y = F(−x_0 + 0.5).]
Boolean OR
input | input | output
x1    | x2    | Y
0     | 0     | 0
0     | 1     | 1
1     | 0     | 1
1     | 1     | 1
[Figure: a single perceptron with weights w_1 = w_2 = 1 computes OR; in the (x_1, x_2) plane the positive (+) and negative (−) examples are linearly separable.]
Boolean AND
input | input | output
x1    | x2    | Y
0     | 0     | 0
0     | 1     | 0
1     | 0     | 0
1     | 1     | 1
[Figure: a single perceptron with weights w_1 = w_2 = 1 (and a larger negative bias) computes AND; the examples are again linearly separable.]
Boolean XOR
input | input | output
x1    | x2    | Y
0     | 0     | 0
0     | 1     | 1
1     | 0     | 1
1     | 1     | 0
[Figure: in the (x_1, x_2) plane the XOR examples (+ at (0,1) and (1,0), − at (0,0) and (1,1)) are not linearly separable, so no single perceptron can compute XOR.]
Boolean XOR
input | input | output
x1    | x2    | o
0     | 0     | 0
0     | 1     | 1
1     | 0     | 1
1     | 1     | 0
[Figure: two-layer network – hidden unit h1 computes OR (weights 1, 1, bias w10 = −0.5), hidden unit h2 computes AND (weights 1, 1, bias w20 = −1.5); the output unit combines them with weights +1 (from h1) and −1 (from h2) and bias −0.5, so o = step(h1 − h2 − 0.5) = XOR(x1, x2).]
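To check the weights above, here is a small sketch that builds the NOT, OR and AND perceptrons and then XOR from the OR/AND hidden units (the weight and bias values are the ones on the slides; the helper names are mine):

```python
import numpy as np

def step(z):
    return (z > 0).astype(int)

def unit(x, w, b):
    """One perceptron applied to a batch of inputs: step(x . w + b)."""
    return step(x @ w + b)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

NOT_ = step(-1 * np.array([0, 1]) + 0.5)   # inverter: w = -1, bias 0.5  -> [1 0]
OR_  = unit(X, np.array([1, 1]), -0.5)     # OR : w1 = w2 = 1, bias -0.5 -> [0 1 1 1]
AND_ = unit(X, np.array([1, 1]), -1.5)     # AND: w1 = w2 = 1, bias -1.5 -> [0 0 0 1]

# XOR: hidden units h1 = OR, h2 = AND, output o = step(h1 - h2 - 0.5)
H    = np.stack([OR_, AND_], axis=1)
XOR_ = unit(H, np.array([1, -1]), -0.5)    # -> [0 1 1 0]
```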
Exercise
▪ Design a neural network for each of the following:
▪ XNOR
▪ NOR
▪ NAND
▪ XOR using OR, AND and NAND
▪ XNOR using OR, AND and NOR
ANN Design
ANN
▪ Interconnections: Single Layer, Multilayer
▪ Learning: Parameter Learning, Structure Learning; Supervised, Unsupervised, Reinforcement
▪ Activation Functions: Sigmoid, Softmax, TanH, ReLU, Leaky ReLU
Learning Types
▪ Supervised learning: (Labeled examples)
▪ Agent is given correct answers for each example
▪ Agent is learning a function from examples of its inputs
and outputs
▪ Parameter Learning: the connection weights W, b are updated.
▪ Structure Learning: the structure of the network itself is changed.
[Figure: supervised learning loop – the input X is fed to the neural network (parameters W, b) to produce the predicted output Ŷ; a loss/error generator compares Ŷ with the desired output Y, and the parameters W, b are updated with respect to Δ(Y − Ŷ) of the loss/error.]
▪ Unsupervised learning and reinforcement learning are the other learning types.
Activation Function
▪ ANN activation functions: Sigmoid, Softmax, TanH, ReLU, Leaky ReLU
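For reference, the usual definitions of these activation functions as a short NumPy sketch (the Leaky ReLU slope of 0.01 is a common default I've assumed; it is not given on the slide):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))      # subtract the max for numerical stability
    return e / e.sum()

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0, z)

def leaky_relu(z, alpha=0.01):     # alpha is an assumed default slope
    return np.where(z > 0, z, alpha * z)
```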
Characterization of ANN
ANN
▪ Interconnections
  ▪ Feed Forward: Single Layer, Multilayer
  ▪ Convolution
  ▪ Recurrent
Neuron Design
[Figure: single neuron – input x, weight w, bias b, summation and activation, output y.]
y = \sigma(xw + b)
With x = 1.4, w = 2.5, b = 1:
z = 2.5 \cdot 1.4 + 1 = 4.5
y = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{-(2.5 \cdot 1.4 + 1)}} = 0.989
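A quick numerical check of this worked example (values from the slide):

```python
import numpy as np

x, w, b = 1.4, 2.5, 1.0
z = x * w + b                     # 4.5
y = 1.0 / (1.0 + np.exp(-z))      # sigma(4.5)
print(z, round(y, 3))             # 4.5 0.989
```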
Neuron Design
[Figure: neuron with inputs x_0, x_1, …, x_n, weights w_0, w_1, …, w_n, and bias b.]
y = \sigma\left(\sum_{i=0}^{n} x_i w_i + b\right)
b = [1]
Single Layer Feedforward Neural Network Design
[Figure: single-layer feedforward network – inputs x_0, x_1, x_2 fully connected to units H_0, H_1, H_2 with biases b_0, b_1, b_2, producing outputs y_0, y_1, y_2.]
y_0 = \sigma\left(\sum_{i=0}^{2} x_i w_{0,i} + b_0\right)
y_1 = \sigma\left(\sum_{i=0}^{2} x_i w_{1,i} + b_1\right)
y_2 = \sigma\left(\sum_{i=0}^{2} x_i w_{2,i} + b_2\right)
Single Layer Feedforward Neural Network Design
[Figure: the same single-layer network, first with a single input row (output y of shape 1×3), then with a batch of three input rows (output Y of shape 3×3).]
output = y_{(1\times 3)} \quad\text{or}\quad output = Y_{(3\times 3)}
Z = X \cdot W^T + b
Input = X_{(3\times n)} = \begin{bmatrix} x_{0,0} & x_{0,1} & x_{0,2} \\ x_{1,0} & x_{1,1} & x_{1,2} \\ x_{2,0} & x_{2,1} & x_{2,2} \end{bmatrix}
Weights = W_{(3\times n)} = \begin{bmatrix} w_{0,0} & w_{0,1} & w_{0,2} \\ w_{1,0} & w_{1,1} & w_{1,2} \\ w_{2,0} & w_{2,1} & w_{2,2} \end{bmatrix}
bias = b_{(1\times 3)} = [b_0, b_1, b_2]
z_0 = [\,x_{0,0} w_{0,0} + x_{0,1} w_{0,1} + x_{0,2} w_{0,2} + b_0,\;\; x_{0,0} w_{1,0} + x_{0,1} w_{1,1} + x_{0,2} w_{1,2} + b_1,\;\; x_{0,0} w_{2,0} + x_{0,1} w_{2,1} + x_{0,2} w_{2,2} + b_2\,]_{(1\times 3)}
Z = \begin{bmatrix} z_0 \\ z_1 \\ z_2 \end{bmatrix}_{(3\times 3)}
y = \sigma(Z)
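The same computation Z = X · Wᵀ + b, y = σ(Z) as a NumPy sketch; the random values are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((3, 3))            # batch of 3 input rows, 3 features each
W = rng.random((3, 3))            # one weight row per output unit
b = rng.random((1, 3))            # one bias per output unit

Z = X @ W.T + b                   # Z = X . W^T + b, shape (3, 3)
Y = 1.0 / (1.0 + np.exp(-Z))      # y = sigma(Z)
print(Y.shape)                    # (3, 3)
```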
Multi-Layer Feedforward Neural Network Design
[Figure: two-layer feedforward network – inputs x_0, x_1, x_2 feed hidden units H_{1,0}, H_{1,1}, H_{1,2}, which feed output units H_{2,0}, H_{2,1} producing y_0, y_1; each layer has its own biases.]
Y = \sigma(\sigma(X \cdot W_1 + b_1) \cdot W_2 + b_2)
Multi-Layer Feedforward Neural Network Design
[Figure: three-layer feedforward network – inputs x_0, x_1, x_2 feed hidden layer H_{1,*}, then H_{2,*}, then the single output unit H_{3,0} = y_0; biases b_{1,*}, b_{2,*}, b_{3,0}.]
X_{(10\times 3)},\; W_1{}_{(3\times 3)},\; b_1{}_{(1\times 3)},\; W_2{}_{(2\times 3)},\; b_2{}_{(1\times 2)},\; W_3{}_{(1\times 2)},\; b_3{}_{(1\times 1)},\; Y_{(10\times 1)}
Y = \sigma(\sigma(\sigma(X \cdot W_1 + b_1) \cdot W_2 + b_2) \cdot W_3 + b_3)
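A sketch of this three-layer forward pass with the shapes listed above. Note that with W₂ of shape 2×3 and W₃ of shape 1×2 the weights must be applied transposed, matching the Z = X · Wᵀ + b convention of the single-layer slide; the random values are illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X  = rng.random((10, 3))
W1, b1 = rng.random((3, 3)), rng.random((1, 3))
W2, b2 = rng.random((2, 3)), rng.random((1, 2))
W3, b3 = rng.random((1, 2)), rng.random((1, 1))

H1 = sigmoid(X  @ W1.T + b1)      # (10, 3)
H2 = sigmoid(H1 @ W2.T + b2)      # (10, 2)
Y  = sigmoid(H2 @ W3.T + b3)      # (10, 1)
print(Y.shape)                    # (10, 1)
```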
Multilayer Neural Networks
▪ A multi-layer feedforward neural network with ≥ 1 hidden layers.
[Figure: network with an input layer, first hidden layer, second hidden layer, and output layer.]
Roles of Layers
▪ Input Layer
▪ Accepts input signals from the outside world
▪ Distributes the signals to the neurons in the hidden layer
[Figure: a single neuron x → xw + b → ŷ; comparing ŷ with the target y gives the loss L. A second plot shows the loss L as a function of w, with its minimum Loss_min.]
▪ Find the derivative of L w.r.t. w, i.e. \frac{\partial L}{\partial w}, and move w toward the value that minimizes the loss.
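A minimal sketch of that idea for a single linear neuron with a squared-error loss: compute ∂L/∂w and take one gradient step toward the loss minimum (the target y = 0 and the learning rate 0.1 are assumed values, not from the slides):

```python
x, w, b = 1.4, 2.5, 1.0       # input, weight, bias (values from the earlier worked example)
y = 0.0                       # assumed target, for illustration only

y_hat = x * w + b             # neuron output
L = (y - y_hat) ** 2          # squared-error loss

dL_dw = -2 * (y - y_hat) * x  # chain rule: dL/dy_hat * dy_hat/dw
w = w - 0.1 * dL_dw           # one gradient-descent step toward Loss_min
```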
Backpropagation
▪ “backward propagation of errors”
▪ Method used in ANNs to calculate the gradient of a loss L w.r.t. the
weights w, which is needed to compute the updated weights.
▪ Computes the gradient one layer at a time, iterating backward
from the last layer, to avoid redundant calculation of
intermediate terms in the chain rule.
▪ Used to train ANNs by minimizing the difference between the
predicted outputs Ŷ and the actual outputs Y.
[Figure: three-layer network with inputs x_1 … x_n, hidden neurons j, and outputs y_1 … y_l. Input signals propagate forward through weights w_{ij} (input → hidden) and w_{jk} (hidden → output); error signals propagate backward from the output layer to the hidden layer, and the weights of both layers are trained.]
Derivative Rules
Chain Rule
Single Neuron
[Figure: computational graph of ŷ = ReLU(xw + b) with x = 3, w = 2, b = 1: the multiply gate gives 3 · 2 = 6, the add gate gives 6 + 1 = 7, and ReLU(7) = 7.]
Single Neuron
[Figure: the same graph without the activation: m = x · w = 6, f = m + b = 7.]
Single Neuron
f(x, w, b) = x \cdot w + b
[Figure: computational graph with x = 3, w = 2, b = 1; forward values m = 6, f = 7; the upstream gradient \frac{\partial f}{\partial f} = 1 flows back through the add gate, giving \frac{\partial f}{\partial m} = 1 and \frac{\partial f}{\partial b} = 1.]
1. Forward Pass:
   • m = x \cdot w
   • f = m + b
2. Backward Pass:
   • \frac{\partial f}{\partial x}
   • \frac{\partial f}{\partial w}
   • \frac{\partial f}{\partial b}
\frac{\partial f}{\partial m} = \frac{\partial (m+b)}{\partial m} = 1, \qquad \frac{\partial f}{\partial b} = \frac{\partial (m+b)}{\partial b} = 1
Single Neuron
f(x, w, b) = x \cdot w + b = f(sum(mul(x, w), b))
[Figure: same graph; the gradient flowing into w is 3 = 3 \cdot 1.]
1. Forward Pass: m = x \cdot w, f = m + b
2. Backward Pass:
\frac{\partial f}{\partial w} = \frac{\partial m}{\partial w}\,\frac{\partial f}{\partial m} = \frac{\partial (x \cdot w)}{\partial w} \cdot 1 = x = 3
Downstream Gradient = Local Gradient × Upstream Gradient
Single Neuron
f(x, w, b) = x \cdot w + b = f(sum(mul(x, w), b))
[Figure: same graph; the gradient flowing into x is 2 = 2 \cdot 1.]
1. Forward Pass: m = x \cdot w, f = m + b
2. Backward Pass:
\frac{\partial f}{\partial x} = \frac{\partial m}{\partial x}\,\frac{\partial f}{\partial m} = \frac{\partial (x \cdot w)}{\partial x} \cdot 1 = w = 2
Downstream Gradient = Local Gradient × Upstream Gradient
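The three backward-pass slides above collected into one sketch (numbers match the slides: x = 3, w = 2, b = 1):

```python
# Forward pass
x, w, b = 3.0, 2.0, 1.0
m = x * w            # multiply gate: m = 6
f = m + b            # add gate:      f = 7

# Backward pass (downstream = local * upstream)
df_df = 1.0
df_dm = 1.0 * df_df  # add gate: local gradient 1
df_db = 1.0 * df_df  # add gate: local gradient 1
df_dw = x * df_dm    # multiply gate: local gradient x -> 3
df_dx = w * df_dm    # multiply gate: local gradient w -> 2
print(df_dx, df_dw, df_db)   # 2.0 3.0 1.0
```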
Single Neuron Gradient
Single Neuron
[Figure: computational graph of ŷ = ReLU(xw + b) with x = 3, w = 2, b = 1; forward values m = 6, sum = 7, ReLU(7) = 7.]
\frac{\partial \max(x, 0)}{\partial x} = 1 \text{ if } x > 0 \text{ (else } 0\text{)}
Backward: the ReLU gate's local gradient is 1 (its input 7 > 0), so it passes the upstream gradient 1 through; the add gate passes 1 to both m and b; the multiply gate gives \frac{\partial \hat{y}}{\partial x} = w \cdot 1 = 2 and \frac{\partial \hat{y}}{\partial w} = x \cdot 1 = 3.
Downstream Gradient = Local Gradient × Upstream Gradient
Single Neuron
[Figure: computational graph of ŷ = σ(xw + b) with x = 3, w = 2, b = 1; forward values m = 6, sum = 7, σ(7) ≈ 0.99.]
\frac{\partial \sigma(x)}{\partial x} = (1 - \sigma(x))\,\sigma(x)
Local gradient of the sigmoid gate: (1 − 0.99) \cdot 0.99 = 0.0099.
Backward: the add gate passes 0.0099 to both m and b; the multiply gate gives \frac{\partial \hat{y}}{\partial x} = w \cdot 0.0099 = 2 \cdot 0.0099 ≈ 0.02 and \frac{\partial \hat{y}}{\partial w} = x \cdot 0.0099 = 3 \cdot 0.0099 ≈ 0.03.
Downstream Gradient = Local Gradient × Upstream Gradient
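The same graph with a sigmoid output as a sketch. Computed exactly, σ(7) ≈ 0.999, so the gradients come out somewhat smaller than the slide's numbers, which round σ(7) to 0.99:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass
x, w, b = 3.0, 2.0, 1.0
s = x * w + b                 # 7
y = sigmoid(s)                # ~0.999 (the slide rounds this to 0.99)

# Backward pass: local gradient of the sigmoid gate is sigma * (1 - sigma)
dy_ds = y * (1.0 - y)         # ~0.00091 exactly; ~0.0099 with the slide's rounding
dy_dm = 1.0 * dy_ds           # add gate passes the gradient through
dy_db = 1.0 * dy_ds
dy_dw = x * dy_dm             # multiply gate: local gradient x
dy_dx = w * dy_dm             # multiply gate: local gradient w
print(dy_dx, dy_dw, dy_db)
```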
Example
Pattern in Gradient Flow
Vector Derivatives
Backpropagation with Vectors
Backpropagation with Matrices
Reverse-Mode Automatic Differentiation
Backprop: Higher-Order Derivatives
Loss Functions and Their Local Gradients
Mean Square Error:
L = \frac{1}{n} \sum_{i=0}^{n} (y_i - \hat{y}_i)^2, \qquad \frac{\partial L_i}{\partial \hat{y}_i} = -\frac{2}{n}\,(y_i - \hat{y}_i)
Categorical Cross Entropy:
L = -\sum_{i} y_i \cdot \log(\hat{y}_i), \qquad \frac{\partial L_i}{\partial \hat{y}_i} = -\frac{y_i}{\hat{y}_i}
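Both losses and their local gradients as a NumPy sketch (the example target and prediction are assumed values for illustration):

```python
import numpy as np

def mse(y, y_hat):
    n = y.size
    loss = np.mean((y - y_hat) ** 2)
    grad = -(2.0 / n) * (y - y_hat)       # dL/dy_hat
    return loss, grad

def categorical_cross_entropy(y, y_hat):
    loss = -np.sum(y * np.log(y_hat))
    grad = -y / y_hat                     # dL/dy_hat
    return loss, grad

y     = np.array([0.0, 1.0, 0.0])         # one-hot target (assumed example)
y_hat = np.array([0.2, 0.7, 0.1])         # predicted probabilities
print(categorical_cross_entropy(y, y_hat)[0])   # ~0.357
```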
Back-propagation Algorithm
▪ In a back-propagation neural network, the learning algorithm has 2 phases.
1. Forward propagation of inputs
2. Backward propagation of errors
▪ The algorithm loops over the 2 phases until the errors obtained are lower than a
certain threshold.
▪ Learning is done in a similar manner as in a perceptron
▪ A set of training inputs is presented to the network.
▪ The network computes the outputs.
▪ The weights are adjusted to reduce errors.
▪ The activation function used is a sigmoid function.
Y_{sigmoid} = \frac{1}{1 + e^{-X}}
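Putting the two phases together: a compact sketch that trains a small sigmoid network on XOR with a mean-squared-error loss. The hidden-layer size, learning rate, and iteration count are my choices, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # hidden layer (4 units, assumed)
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # output layer
lr = 0.5

for _ in range(20000):
    # Phase 1: forward propagation of inputs
    H = sigmoid(X @ W1 + b1)
    Y_hat = sigmoid(H @ W2 + b2)

    # Phase 2: backward propagation of errors (MSE loss, sigmoid local gradients)
    dY = (Y_hat - Y) * Y_hat * (1 - Y_hat)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dY
    b2 -= lr * dY.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ dH
    b1 -= lr * dH.sum(axis=0, keepdims=True)

print(Y_hat.round(2))   # should be close to the XOR targets [[0], [1], [1], [0]]
```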
Improving the Perceptron
Probabilistic Classification
▪ Naïve Bayes provides probabilistic classification
[Figure: two handwritten-digit examples with per-class probabilities – one is classified as a 0 with probability 0.991, the other as a 2 with probability 0.703 (6: 0.264, other classes near 0.001).]
Note: I’m going to be lazy and use “x” in place of “f(x)” here – this is just for notational convenience!
A Probabilistic Perceptron
A 1D Example
The Soft Max
P(y \mid x) = \frac{e^{w_y \cdot x}}{\sum_{y'} e^{w_{y'} \cdot x}} \quad \text{(the denominator is the normalizer)}
How to Learn?
▪ Maximum likelihood estimation
  o Equivalently: maximize the log-likelihood of the training labels
1D optimization
https://playground.tensorflow.org/