Lec06 Derivatives

Intuition about Derivatives

• Consider a function f(a) = 3a; since it is linear, its graph is a straight line.
• For a = 2 => f(a) = 6
• For a small increment, a = 2.001 => f(a) = 6.003
• This implies that an increase of 0.001 in a produces an increase of 0.003 in f(a), i.e., 3 times as much.
• Meaning, the slope (derivative) of f(a) at a = 2 is 3.
• i.e., derivative = height/width = 0.003/0.001 = 3
• Consider another point, a = 5 => f(a) = 15
• For a small increment, a = 5.001 => f(a) = 15.003
• The slope (derivative = df(a)/da) of f(a) at a = 5 is still 3 for this function.
Intuition about Derivatives
• Consider another function, f(a) = a²
• For a = 2 => f(a) = 4
• For a small increment, a = 2.001 => f(a) = 4.004001
• Meaning, the slope (derivative) of f(a) at a = 2 is 4.
• The slope is different for this function at different values of a.
• For a = 5 => f(a) = 25
• For a = 5.001 => f(a) = 25.010
• The slope (derivative = df(a)/da) of f(a) at a = 5 is 10 for this function.
• The slope does follow a consistent rule for this function: df(a)/da = 2a.
• Meaning, for a small increment in a of 0.001, f(a) bumps up by (2a) × 0.001.
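As an illustration (not from the original slides), a minimal Python sketch that estimates these slopes numerically with the same nudge of 0.001:

    def numerical_slope(f, a, eps=0.001):
        # (f(a + eps) - f(a)) / eps approximates the derivative df/da at a
        return (f(a + eps) - f(a)) / eps

    print(numerical_slope(lambda a: 3 * a, 2))    # ~3.0
    print(numerical_slope(lambda a: 3 * a, 5))    # ~3.0
    print(numerical_slope(lambda a: a ** 2, 2))   # ~4.001, close to the exact 2a = 4
    print(numerical_slope(lambda a: a ** 2, 5))   # ~10.001, close to the exact 2a = 10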
Conclusion about Derivatives
• Derivative: rate of change. It is just the slope of a line (or of a function at a point).
• If you want to look up the derivative of a function, open your calculus textbook.
• You will often find a formula for the slope of these functions at different points.
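As an aside (not from the slides), a symbolic math library such as SymPy can also produce these formulas, assuming it is installed:

    import sympy as sp

    a = sp.symbols('a')
    print(sp.diff(3 * a, a))     # 3
    print(sp.diff(a ** 2, a))    # 2*a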
Computation Graph
• The computations of a neural network are organized in terms of a forward pass (forward propagation) step
• to compute the output of the neural network,
• followed by a backward pass (backpropagation) step
• to compute gradients or derivatives.
• The computation graph explains why it is organized this way.
Computation Graph
• To illustrate it, let's use a simpler example: J = 3(a + bc).
• We can break this into three steps: u = bc, v = a + u, and J = 3v.
• As an example, if a = 5, b = 3 and c = 2, then u = 6, v = 11 and J = 33.
• We can take these three steps and draw them as a computation graph, as shown graphically.
• The computation graph helps when we want to optimize some variable, J in this case.
• In the case of logistic regression, J was the cost function that we have to minimize.
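A minimal Python sketch of this forward pass, using the intermediate names u, v and J from the example above:

    # Forward (left-to-right) pass through the computation graph J = 3 * (a + b * c)
    a, b, c = 5, 3, 2
    u = b * c        # u = 6
    v = a + u        # v = 11
    J = 3 * v        # J = 33
    print(u, v, J)   # 6 11 33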
Computation Graph
• The computation graph organizes a computation with these arrows, going left to right, to compute the value of J.
• To compute derivatives, there will be a right-to-left pass, in the opposite direction of the blue arrows.
• That is the most natural order for computing the derivatives.
Derivatives with a Computation Graph
• Let's say we want to compute the derivative of J with respect to v.
• This asks: if we were to take this value of v and change it a little bit, how would the value of J change?
• Since J is defined as 3 times v, and right now v = 11,
• if we bump up v by a little bit to 11.001, then J, which is 3v, gets bumped up from its current value of 33 to 33.003 on increasing v by 0.001.
• So the derivative of J with respect to v is equal to 3.

Terminology — backpropagation: by computing the derivative of this final output variable J with respect to v, we have done one step of backpropagation.
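As a quick numerical check (a sketch, not part of the slides), bump v by 0.001 and observe that J moves by about 0.003:

    v = 11.0
    J = 3 * v                       # 33.0
    J_bumped = 3 * (v + 0.001)      # 33.003
    print((J_bumped - J) / 0.001)   # ~3.0, i.e. dJ/dv = 3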
Derivatives with a Computation Graph
• What is dJ/da?
• The increase to J is 3 times the increase to a.
• So that means this derivative is equal to 3.
Derivatives with a Computation Graph
• One way to break this down is to say that if you change a, then that will change v, and through changing v, that will change J.
• First, by changing a, you end up increasing v.
• Well, how much does v increase? It is increased by an amount determined by dv/da.
• Then the change in v will cause the value of J to also increase.
• In calculus, this is called the chain rule.
• In code, instead of the long name dJ/dvar, we simply write dvar.
Derivatives with a Computation Graph
• The key takeaway from this example is that, when computing all of these derivatives, the most efficient way to do so is through a right-to-left computation, following the direction of the red arrows.
• In particular, we first compute the derivative with respect to v. That then becomes useful for computing the derivative with respect to a and the derivative with respect to u. The derivative with respect to u, in turn, becomes useful for computing the derivative with respect to b and the derivative with respect to c.
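A minimal sketch of this right-to-left pass for J = 3(a + bc), using the code convention dvar for dJ/dvar (dJ/db and dJ/dc follow the same pattern and are left for the exercise below):

    # Forward pass
    a, b, c = 5, 3, 2
    u = b * c
    v = a + u
    J = 3 * v

    # Backward pass (right to left)
    dv = 3.0         # dJ/dv, since J = 3v
    da = dv * 1.0    # dJ/da = dJ/dv * dv/da, and dv/da = 1 because v = a + u
    du = dv * 1.0    # dJ/du = dJ/dv * dv/du, and dv/du = 1
    print(da, du)    # 3.0 3.0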
Exercise: Compute
1. dJ/db?
2. dJ/dc? (Hint: go through dJ/du.)
Practical Example
Derivatives with a Computation Graph
We have discussed the computation graph and how to do
1. a forward, or left-to-right, calculation to compute a cost function such as J that you might want to optimize, and
2. a backwards, or right-to-left, calculation to compute derivatives.
These concepts will now be applied to compute the derivatives of the logistic regression model.
Logistic Regression Gradient Descent
Recap: We had set up logistic regression as follows: z = w1x1 + w2x2 + b, a = ŷ = σ(z), with loss L(a, y).
• Let's write this out as a computation graph for this example, where we have only two features, x1 and x2.
• In logistic regression, what we want to do is modify the parameters, w and b, in order to reduce this loss.
Logistic Regression Gradient Descent
• Let's talk about how you can go backwards to compute the derivatives.
• With z = w1x1 + w2x2 + b and writing dz for dL/dz:
• dL/db = (dL/dz)(dz/db) = dz · (0 + 0 + 1) = dz
• dL/dw1 = (dL/dz)(dz/dw1) = dz · (x1 + 0 + 0) = x1 · dz
• Similarly, dL/dw2 = x2 · dz
Logistic Regression Gradient Descent
Gradient descent is performed by updating w1, w2, and b for logistic regression w.r.t. a single training example, using the following values:
• dz = (a - y)
• dw1 = x1 (a - y)
• dw2 = x2 (a - y)
• db = (a - y)
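A hedged Python sketch of these computations for a single training example (the sigmoid helper and the values of x1, x2, y, w1, w2, b and the learning rate alpha are illustrative assumptions):

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # one illustrative training example and illustrative parameter values
    x1, x2, y = 1.0, 2.0, 1.0
    w1, w2, b = 0.1, -0.2, 0.0
    alpha = 0.01                  # learning rate

    # forward pass
    z = w1 * x1 + w2 * x2 + b
    a = sigmoid(z)                # prediction

    # backward pass: derivatives of the loss L(a, y)
    dz = a - y
    dw1 = x1 * dz
    dw2 = x2 * dz
    db = dz

    # one gradient descent update
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b -= alpha * db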
Next, we consider m training examples.
Logistic Regression Gradient Descent on m examples
• Recap: We have discussed how to compute these derivatives for a single training example, denoted with superscript i.
• So, compute these derivatives on each training example, as shown above, and average them to give the overall gradient used to implement gradient descent.
Logistic Regression Gradient Descent on m examples
What we're going to do is use a for loop over the training set: compute the derivatives with respect to each training example and then add them up.
• We use dw_1, dw_2 and db as accumulators, so that after this computation, dw_1 is equal to the derivative of the overall cost function J with respect to w_1, and similarly for dw_2 and db.
• Notice that dw_1 and dw_2 do not have a superscript i, because we're using them in this code as accumulators to sum over the entire training set.
• With all of these calculations, you've computed the derivatives of the cost function J with respect to each of your parameters w_1, w_2 and b.
• Finally, to implement one step of gradient descent, update the learnable parameters: w_1 := w_1 − α·dw_1, w_2 := w_2 − α·dw_2, b := b − α·db.
• So, everything on the slide implements just one single step of gradient descent.
• We have to repeat it multiple times in order to take multiple steps of gradient descent.
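A sketch of this procedure with an explicit for loop over m training examples (assumes Python lists X1, X2, Y of length m; the function name and sigmoid helper are illustrative):

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def one_gradient_descent_step(X1, X2, Y, w1, w2, b, alpha):
        m = len(Y)
        J, dw1, dw2, db = 0.0, 0.0, 0.0, 0.0      # cost and gradient accumulators
        for i in range(m):                        # loop over the m training examples
            z = w1 * X1[i] + w2 * X2[i] + b
            a = sigmoid(z)
            J += -(Y[i] * math.log(a) + (1 - Y[i]) * math.log(1 - a))
            dz = a - Y[i]
            dw1 += X1[i] * dz                     # the inner loop over the n features is
            dw2 += X2[i] * dz                     # written out explicitly here since n = 2
            db += dz
        J, dw1, dw2, db = J / m, dw1 / m, dw2 / m, db / m
        # one gradient descent update
        w1 -= alpha * dw1
        w2 -= alpha * dw2
        b -= alpha * db
        return w1, w2, b, J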
Logistic Regression Gradient Descent on m examples
Weakness of the previous implementation:
• To implement logistic regression this way, we need two for loops.
• The first for loop is over the m training examples, and
• the second for loop is over all the n features.
• In the deep learning era, we move to bigger and bigger datasets, so we want to avoid explicit for loops.
• Solution: vectorization techniques allow us to get rid of these explicit for loops.
Vectorization
• Vectorization is the art of getting rid of explicit "for" loops in your code.
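A hedged NumPy sketch of the same single step with no explicit for loop, assuming X is an (n, m) array of features, Y a (1, m) array of labels, and w an (n, 1) column of weights (names are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def one_step_vectorized(X, Y, w, b, alpha):
        m = X.shape[1]
        Z = np.dot(w.T, X) + b          # all z values for the m examples at once
        A = sigmoid(Z)                  # all predictions at once
        dZ = A - Y
        dw = np.dot(X, dZ.T) / m        # replaces both explicit for loops
        db = np.sum(dZ) / m
        w = w - alpha * dw              # one gradient descent update
        b = b - alpha * db
        return w, b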
Vectorization
• Next: more examples.
