Deep Learning Tutorial
Brains, Minds, and Machines Summer Course 2018
TA: Eugenio Piasini & Yen-Ling Kuo
Roadmap
● Supervised Learning with Neural Nets
● Convolutional Neural Networks for Object Recognition
● Recurrent Neural Networks
● Other Deep Learning Models
Supervised Learning
with Neural Nets
General references:
Hertz, Krogh, Palmer 1991
Goodfellow, Bengio, Courville 2016
Supervised learning
Given example input-output pairs (X,Y),
learn to predict output Y from input X
Logistic regression, support vector machines, decision trees, neural networks...
Binary classification:
simple perceptron
(McCulloch & Pitts 1943)
Perceptron learning rule
(Rosenblatt 1962)
The output is y = g(w·x + b), where g is a nonlinear activation function, in this case a step (threshold) function.
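As an illustrative sketch only (NumPy; the AND-function example, learning rate, and variable names are our own, not from the slides), the perceptron learning rule updates the weights only on misclassified examples:

import numpy as np

def step(z):
    # Heaviside step activation: g(z) = 1 if z >= 0 else 0
    return (z >= 0).astype(float)

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Rosenblatt's perceptron learning rule on inputs X (n_samples, n_features)
    and binary targets y in {0, 1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            y_hat = step(w @ xi + b)
            # weights move only when the prediction is wrong
            w += lr * (yi - y_hat) * xi
            b += lr * (yi - y_hat)
    return w, b

# Example: learn the (linearly separable) AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)
w, b = train_perceptron(X, y)
print(step(X @ w + b))   # -> [0. 0. 0. 1.]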
Linear separability
Simple perceptrons can only learn to solve linearly separable problems (Minsky and Papert 1969).
We can solve more complex problems by composing many units in multiple layers.
Multilayer perceptron (MLP)
(“forward propagation”)
MLPs are universal function approximators (Cybenko 1989; Hornik 1989).
(under some assumptions… exercise: show that if g is linear, this architecture reduces to a simple perceptron)
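A minimal NumPy sketch of forward propagation through a one-hidden-layer MLP (the layer sizes, tanh hidden nonlinearity, and sigmoid output are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2, g=np.tanh):
    """Forward propagation: input x -> hidden activation h -> output y."""
    h = g(W1 @ x + b1)          # hidden layer: nonlinearity g applied elementwise
    y = sigmoid(W2 @ h + b2)    # output layer (here a single sigmoid unit)
    return y, h

# Example with random weights: 2 inputs, 3 hidden units, 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
y, h = mlp_forward(np.array([0.5, -1.0]), W1, b1, W2, b2)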
Deep vs shallow
Universality: “shallow” MLPs with one hidden layer can represent any continuous function to arbitrary
precision, given a large enough number of units. But:
● No guarantee that the number of required units is reasonably small (expressivity).
● No guarantee that the desired MLP can actually be found with our chosen learning method
(learnability).
Two motivations for using deep nets instead (see Goodfellow et al 2016, section 6.4.1):
● Statistical: deep nets are compositional, and naturally well suited to representing hierarchical
structures where simpler patterns are composed and reused to form more complex ones
recursively. It can be argued that many interesting structures in real world data are like this.
● Computational: under certain conditions, it can be proved that deep architectures are more
expressive than shallow ones, i.e. they can learn more patterns for a given total size of the network.
Backpropagation
(Rumelhart, Hinton, Williams 1986)
Problem: compute all the derivatives of the loss with respect to the weights.
Key insights: the loss depends
● on the weights w of a unit only through that unit's activation h
● on a unit's activation h only through the activations of those units that are downstream from h.
The "errors" being backpropagated give the gradient of the loss with respect to the weights, which you can then use with your favorite gradient descent method.
Backpropagation - example
(exercise: derive gradient wrt bias terms b)
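A NumPy sketch of backpropagation for a one-hidden-layer MLP with squared-error loss, including the bias gradients asked for in the exercise (the tanh hidden layer and linear output are our own illustrative choices):

import numpy as np

def forward_backward(x, t, W1, b1, W2, b2):
    """One forward/backward pass for a tanh hidden layer, linear output,
    and loss L = 0.5 * ||y - t||^2."""
    # forward propagation
    h = np.tanh(W1 @ x + b1)            # hidden activations
    y = W2 @ h + b2                     # linear output
    loss = 0.5 * np.sum((y - t) ** 2)

    # backward pass: errors flow from the output back through h
    delta_y = y - t                               # dL/dy
    dW2 = np.outer(delta_y, h)                    # dL/dW2
    db2 = delta_y                                 # dL/db2  (exercise)
    delta_h = (W2.T @ delta_y) * (1 - h ** 2)     # error at hidden layer; tanh' = 1 - h^2
    dW1 = np.outer(delta_h, x)                    # dL/dW1
    db1 = delta_h                                 # dL/db1  (exercise)
    return loss, (dW1, db1, dW2, db2)

# one gradient-descent step with learning rate eta
eta = 0.1
x, t = np.array([0.5, -1.0]), np.array([1.0])
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
loss, (dW1, db1, dW2, db2) = forward_backward(x, t, W1, b1, W2, b2)
W1 -= eta * dW1; b1 -= eta * db1; W2 -= eta * dW2; b2 -= eta * db2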
“The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence […] Dr. Frank Rosenblatt, a research psychologist at the Cornell Aeronautical Laboratory, Buffalo, said Perceptrons might be fired to the planets as mechanical space explorers.”
The New York Times, July 8th, 1958

“The perceptron has shown itself worthy of study despite (and even because of!) its severe limitations. It has many features to attract attention: its linearity; its intriguing learning theorem; its clear paradigmatic simplicity as a kind of parallel computation. There is no reason to suppose that any of these virtues carry over to the many-layered version. Nevertheless, we consider it to be an important research problem to elucidate (or reject) our intuitive judgement that the extension to multilayer systems is sterile.”
Minsky and Papert 1969 (section 13.2)
Convolutional Neural Networks
for Object Recognition
General (excellent!) reference:
“Convolutional Networks for Visual Recognition”, Stanford University
http://cs231n.stanford.edu/
Traditional Object Detection/Recognition Idea
● Match low-level vision features (e.g. edges, HOG, SIFT, etc.)
● Parts-based models
(Lowe 2004)
Learning the features - inspiration from neuroscience
Hubel and Wiesel:
● Topographic organization of
connections
● Hierarchical organization of
simple/complex cells
(Hubel and Wiesel 1962)
(Fukushima 1980)
“Canonical” CNN structure
INPUT -> [[CONV -> RELU]*K -> POOL?]*L -> [FC -> RELU]*M -> FC
Credit: cs231n.github.io
Four basic operations:
1. Convolution
2. Nonlinearity (ReLU)
3. Pooling
4. Fully connected layers
(LeCun et al 1998)
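Purely as an illustration, the same template written out in PyTorch with K=2, L=2, M=1 (the framework choice and all layer sizes are assumptions, not from the slides):

import torch
import torch.nn as nn

# INPUT -> [[CONV -> RELU]*K -> POOL?]*L -> [FC -> RELU]*M -> FC, with K=2, L=2, M=1
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

logits = model(torch.randn(1, 3, 32, 32))   # a 32x32 RGB image -> 10 class scores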
2D Convolution
Example: blurring an image
Replacing each pixel with an
average of its neighbors
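A quick sketch of this blurring example with SciPy (the 3×3 averaging kernel and toy image are our own):

import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(8, 8)               # toy grayscale image
kernel = np.ones((3, 3)) / 9.0             # 3x3 box filter: average of the neighborhood
blurred = convolve2d(image, kernel, mode='valid')   # each output pixel = local average
print(image.shape, blurred.shape)          # (8, 8) (6, 6)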
2D Convolution
If N = input size, K = filter size, S = stride (the stride is the size of the step you take on the input every time you move by one on the output):
Output size = (N - K)/S + 1
Example: N = 32, K = 5, S = 1 → (N - K)/S + 1 = 28
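The sizing rule as a tiny helper function (assuming valid convolution with no zero-padding, as on this slide):

def conv_output_size(n, k, s=1):
    # N = input size, K = filter size, S = stride (no zero-padding)
    return (n - k) // s + 1

print(conv_output_size(32, 5, 1))   # -> 28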
More on convolution sizing
Example: convolving a 32×32×3 input with a 5×5×3 filter (stride 1) gives a 28×28×1 feature map; several filters give a stack of feature maps.
Input depth = # of channels in the previous layer (often 3 for the input layer (RGB); can be arbitrary for deeper layers).
Output depth = # of filters (feature maps).
Convolve with Different Filters
Convolution (with learned filters)
● Dependencies are local
● Each filter has few parameters to learn
○ The same parameters are shared across different locations of the input
● Multiple filters produce multiple feature maps
Fully Connected vs. Locally Connected
Credit: Ranzato’s CVPR 2014 tutorial
Non-linearity
● Rectified linear function (ReLU)
○ Applied per-pixel: output = max(0, input)
(input feature map → output feature map)
Pooling
● Reduce size of representation in following layers
● Introduce some invariance to small translation
Image credit: http://cs231n.github.io/convolutional-networks/
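A NumPy sketch of max pooling (the 2×2 window and stride 2 are illustrative assumptions):

import numpy as np

def max_pool(x, size=2, stride=2):
    """Max pooling on a 2D feature map x."""
    h = (x.shape[0] - size) // stride + 1
    w = (x.shape[1] - size) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max()   # keep only the strongest response in each window
    return out

fmap = np.arange(16).reshape(4, 4)
print(max_pool(fmap))   # [[ 5.  7.] [13. 15.]]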
Key evolutionary steps
● Neocognitron - Fukushima 1980
○ Inspired by Hubel and Wiesel; “convolutional” structure, alternating “pooling” layers
● LeNet - LeCun et al 1998
○ Learning: backpropagation, gradient descent
● AlexNet - Krizhevsky et al 2012
○ Larger, deeper network (~10^7 params), much more data (ImageNet, ~10^6 images), more compute (incl. GPUs), better regularization (Dropout)
Applications: image classification, image retrieval (e.g. rank images by their Euclidean distance to the test image).
But also object detection, image segmentation, captioning...
Recurrent Neural Networks
Handling Sequential Information
● Natural language processing: sentences, translations
● Speech / Audio: signal processing, speech recognition
● Video: action recognition, captioning
● Sequential decision making / Planning
● Time-series data
● Biology / Chemistry: protein sequences, molecule structures
● ...
Dynamic System / Hidden Markov Model
● Classical form of a dynamic system: s_t = f(s_{t-1}; θ)
● With an external signal x: s_t = f(s_{t-1}, x_t; θ)
● Hidden Markov Model: the same recurrent structure on hidden states, with probabilistic transitions and emissions
Recurrent Network / RNN
● A general form to process a sequence: apply a recurrence formula at each time step,
h_t = f_W(h_{t-1}, x_t)
(new state = a function f with parameters W of the old state and the input at time t).
● The state consists of a vector h; it summarizes the input up to time t, and the prediction y is read out from it.
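A minimal NumPy sketch of this recurrence for a vanilla RNN (the tanh nonlinearity and the weight names W_xh, W_hh, W_hy are conventional choices of ours, not from the slides):

import numpy as np

def rnn_step(h_prev, x_t, W_xh, W_hh, W_hy, b_h, b_y):
    """One application of the recurrence: new state from old state and current input."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)   # h_t = f_W(h_{t-1}, x_t)
    y_t = W_hy @ h_t + b_y                            # prediction from the current state
    return h_t, y_t

# process a whole sequence by carrying h forward
rng = np.random.default_rng(0)
d_in, d_h, d_out = 4, 8, 3
W_xh = rng.normal(size=(d_h, d_in))
W_hh = rng.normal(size=(d_h, d_h))
W_hy = rng.normal(size=(d_out, d_h))
b_h, b_y = np.zeros(d_h), np.zeros(d_out)
h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):    # a sequence of 5 input vectors
    h, y = rnn_step(h, x_t, W_xh, W_hh, W_hy, b_h, b_y)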
Processing a Sequence: Unrolling in Time
[Figure: the RNN is unrolled over the input sequence "I like this course"; the same RNN cell is applied at every time step and a prediction is made from each hidden state.]
Training: Backpropagation Through Time
[Figure: the unrolled RNN over "I like this course"; each step's prediction is compared with its target POS tag (PRP, VBP, DT, NN) to give a per-step loss; the total loss is the sum of the per-step losses.]
Parameter Sharing Across Time
● The parameters are shared across time steps, and their derivatives are accumulated over the unrolled network.
● This makes it possible to generalize to sequences of different lengths.
Vanishing Gradient
[Figure: x → f → f → f → Loss, the same function f applied at every step]
● Backpropagating through many time steps multiplies the same factor over and over, so the gradient grows or shrinks quickly:
○ if |·| > 1, the gradient explodes → clip the gradients
○ if |·| < 1, the gradient vanishes → introduce memory via LSTMs, GRUs
● This makes it hard to learn long-term dependencies.
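A sketch of the gradient-clipping remedy mentioned above, rescaling by the global norm (the threshold value is arbitrary):

import numpy as np

def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale a list of gradient arrays if their combined norm exceeds max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        grads = [g * (max_norm / total_norm) for g in grads]
    return grads

# e.g. clipped = clip_by_global_norm([dW_xh, dW_hh, dW_hy])  # hypothetical gradient arrays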
Long Short Term Memory (LSTM)
● Introducing gates to
optionally let information
flow through.
○ An LSTM cell has three
gates to protect and
control the cell state.
○ Forget the irrelevant part of the previous state; selectively update the cell state values; output certain parts of the cell state.
Image credit: http://harinisuresh.com/2016/10/09/lstms/
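A NumPy sketch of a single LSTM step with the three gates (the packed weight layout and sizes are our own conventions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*d_h, d_in + d_h); b has shape (4*d_h,)."""
    d_h = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0*d_h:1*d_h])        # forget gate: what to drop from the cell state
    i = sigmoid(z[1*d_h:2*d_h])        # input gate: which new values to write
    o = sigmoid(z[2*d_h:3*d_h])        # output gate: what to expose as h_t
    g = np.tanh(z[3*d_h:4*d_h])        # candidate values
    c_t = f * c_prev + i * g           # update the cell state
    h_t = o * np.tanh(c_t)             # output selected parts of the cell state
    return h_t, c_t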
Flexibility of RNNs
Image captioning, sentiment classification, machine translation, POS tagging, ...
Image credit: Andrej Karpathy
Other Deep Learning Models
Auto-encoder
● Learning representations
○ A good representation should keep the information well
○ → objective: minimize the reconstruction error
● Encoder: original input → learned representation; Decoder: learned representation → reconstructed image
[LeCun, 1987]
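A minimal PyTorch sketch of this objective (the fully connected encoder/decoder and their sizes are illustrative assumptions):

import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))

x = torch.rand(32, 784)        # a batch of flattened 28x28 images
z = encoder(x)                 # learned representation
x_hat = decoder(z)             # reconstructed image
loss = F.mse_loss(x_hat, x)    # minimize reconstruction error
loss.backward()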
Generative Models
● What are the learned representations?
○ One view: latent variables (e.g. color, shape, position, ...) that generate the observed data
● Goal of learning a generative model: to recover p(x) from data
● Desirable properties: sampling new data, evaluating the likelihood of data, extracting latent features
● Problem: directly computing p(x) = ∫ p(x|z) p(z) dz is intractable!
Adapted from the IJCAI 2018 deep generative model tutorial
Variational Autoencoder (VAE)
● Decoder: generate x from a latent z via p(z) and p(x|z)
● Idea: approximate the intractable posterior p(z|x) with a simpler, tractable q(z|x) (the encoder)
● Learning objective: reconstruction error + a term measuring how close q(z|x) is to p(z)
[Kingma et al., 2013]
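A sketch of the VAE objective in PyTorch under the usual modeling assumptions (Gaussian q(z|x), standard normal prior, Bernoulli decoder); the network sizes are our own:

import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Linear(784, 2 * 16)            # outputs mean and log-variance of q(z|x)
dec = nn.Linear(16, 784)                # maps z back to pixel logits

x = torch.rand(32, 784)
mu, logvar = enc(x).chunk(2, dim=1)
z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)    # reparameterization trick
x_hat = dec(z)

recon = F.binary_cross_entropy_with_logits(x_hat, x, reduction='sum')   # reconstruction error
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())            # how close q(z|x) is to p(z)
loss = recon + kl                        # negative ELBO
loss.backward()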
Generative Adversarial Network (GAN)
● An implicit generative model, formulated as a minimax game.
○ The discriminator is trying to distinguish real and fake samples.
○ The generator is trying to generate fake samples to fool the discriminator.
[Goodfellow et al., 2014]
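A sketch of one discriminator update and one generator update of this minimax game in PyTorch, using the common non-saturating generator loss (the tiny MLPs and hyperparameters are illustrative assumptions):

import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))    # generator
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1))     # discriminator (outputs a logit)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(32, 784)               # stand-in for a batch of real data
fake = G(torch.randn(32, 16))            # fake samples from random noise

# discriminator step: distinguish real (label 1) from fake (label 0)
d_loss = F.binary_cross_entropy_with_logits(D(real), torch.ones(32, 1)) + \
         F.binary_cross_entropy_with_logits(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: fool the discriminator into labeling fakes as real
g_loss = F.binary_cross_entropy_with_logits(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()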
Thanks & Questions?
● Link to the slides: https://goo.gl/pUXdc1
● Hands-on session on Monday!
Yen-Ling Kuo ([email protected])