Presentation Day 3 - Lasso-Ridge Regression, Logistic Regression, SVM

Ridge Regression and

LASSO Regression
Tubagus Dhafin Rukmanda
PACMANN AI Researcher
Email: [email protected]
+62 89620615729
Bias-Variance Trade-off Review

[Figure: error on training data vs. validation data]
How to deal with Overfitting?

- Subset selection
- Regularization
- Using a less complex model
Regularized Linear Regression
(Shrinkage Methods)
What is Regularization?

IDEA

Add a penalty term to the cost function so that the parameters are shrunk
towards zero.

Regularized OLS Cost Function = RSS + Regularizer


How can we penalize the cost function?

We can use the L2 norm to perform a regularization called Ridge Regression,

or,

we can use the L1 norm to perform a regularization called LASSO Regression.
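For reference, the two penalty terms can be written as follows (a reconstruction in standard notation, since the slides' formulas did not survive extraction; λ is the shrinkage parameter introduced on the next slide):

```latex
\text{Ridge (L2) penalty:} \quad \lambda \sum_{j=1}^{p} \beta_j^2 \;=\; \lambda \lVert \beta \rVert_2^2
\text{LASSO (L1) penalty:} \quad \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \;=\; \lambda \lVert \beta \rVert_1
```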


Ridge Regression: Shrinkage Parameter

Note: we can choose the shrinkage parameter λ by cross-validation, picking the value with the lowest cross-validation error.
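A minimal sketch of choosing λ by cross-validation with scikit-learn, where the shrinkage parameter is called alpha (the data here is synthetic and purely illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# Synthetic regression data, purely illustrative
X, y = make_regression(n_samples=100, n_features=20, noise=10.0, random_state=0)

# Try a grid of shrinkage values; RidgeCV keeps the one with the
# lowest cross-validation error
alphas = np.logspace(-3, 3, 50)
model = RidgeCV(alphas=alphas, cv=5).fit(X, y)

print("Best shrinkage parameter (lambda):", model.alpha_)
```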
Ridge Regression: Visualization

[Figure: ISL]
Ridge Regression Improves on Linear Regression

- Least squares: high variance but no bias (when the relationship between the response and the predictors is close to linear)
- Ridge: reduces variance at the cost of a slight increase in bias

[Figure (ISL): black = squared bias, green = variance, purple = test error]
LASSO Regression: Cost Function

Linear Regression:

\mathrm{RSS} = \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2

Ridge Regression:

\mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2

LASSO Regression:

\mathrm{RSS} + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert

where λ ≥ 0 is the shrinkage (tuning) parameter.
LASSO Regression

[Figures: ISL]
[Figure (ISL): black = squared bias, green = variance, purple = test error]

- LASSO: reduces variance at the cost of a slight increase in bias
- The variance of ridge is slightly lower than the variance of LASSO
Ridge vs. LASSO Regression
LASSO:
● Produces simpler and more interpretable models
(it performs variable selection)
● Performs better when a relatively small number of
predictors have substantial coefficients

Ridge:
● Has slightly lower variance
● Performs better when many predictors have
coefficients of roughly equal size
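A small sketch contrasting the two behaviours (illustrative scikit-learn code; the sparse ground truth is an assumption chosen to make the variable-selection effect visible):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# 100 samples, 10 predictors, but only 2 truly matter (sparse ground truth)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Ridge shrinks all coefficients but keeps them non-zero;
# LASSO sets most irrelevant coefficients exactly to zero (variable selection)
print("ridge non-zero coefs:", np.sum(ridge.coef_ != 0))  # typically 10
print("lasso non-zero coefs:", np.sum(lasso.coef_ != 0))  # typically ~2
```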
Summary

1. Subset selection, ridge, and LASSO decrease the
complexity of a model, decreasing its variance,
increasing its bias (more slowly), and increasing its
interpretability.
2. Ridge doesn't perform variable selection, but LASSO
does.
3. LASSO can handle p > n easily when it has a proper
penalty term.
Logistic Regression
Tubagus Dhafin Rukmanda
PACMANN AI Researcher
Email: [email protected]
+62 89620615729
The Logistic Regression
The Logistic Model
• Using linear regression with a
very large (or small) balance, we
would get values of the default probability
bigger than 1 (or smaller than 0)

Picture from: Introduction to Statistical Learning


The Logistic Model
Instead of a straight-line
relationship, we need a
"squashed" result, that is:
- Upper-bounded by 1
- Lower-bounded by 0

Picture from: Introduction to Statistical Learning


The Logistic Model
We take our old linear regression function
and squash it with the sigmoid function.

Picture from: Introduction to Statistical Learning
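Written out, this gives the logistic function (reconstructed in standard ISL notation, since the slide's equation did not survive extraction):

```latex
p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}
     = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}
```

whose output always lies between 0 and 1.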


Support Vector Machine
Tubagus Dhafin Rukmanda
PACMANN AI Researcher
Email: [email protected]
+62 89620615729
Support Vector Machine (SVM)
IDEA: Separating Data with a Hyperplane
Data with 1 dimension/variable

Hyperplane: a point

Data with 2 dimensions/variables

Hyperplane: a line

Data with 3 dimensions/variables

Hyperplane: a plane

[Image: stackoverflow.com]

Data with n dimensions/variables, n > 3?

Can't be visualized
Separating Hyperplane

We can create infinitely many
separating hyperplanes.

Which hyperplane should we
choose?
Maximum Margin Classifier

[Figure: the maximum margin hyperplane, its margin, and the support vectors]
What if the Data cannot be
Separated by a Hyperplane?
We use a Soft Margin Classifier
(Support Vector Classifier)

Soft = the margin can be violated by some observations
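The slide's optimization problem did not survive extraction; as a reference, ISL formulates the support vector classifier as follows (notation assumed from ISL: M is the margin width, ε_i are slack variables, and C is the violation budget, i.e. the "cost" parameter mentioned below):

```latex
\max_{\beta_0, \ldots, \beta_p,\ \varepsilon_1, \ldots, \varepsilon_n,\ M} \; M
\quad \text{subject to} \quad \sum_{j=1}^{p} \beta_j^2 = 1,
y_i \bigl( \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} \bigr) \ge M (1 - \varepsilon_i),
\varepsilon_i \ge 0, \qquad \sum_{i=1}^{n} \varepsilon_i \le C.
```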


Support Vector Classifier
Max Margin vs. SVC

Maximum Margin Classifier: every observation must lie on the correct side of the margin.

Support Vector Classifier: some observations may violate the margin; the total violation is budgeted by the “cost” parameter.
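A quick sketch of how the cost parameter behaves in practice, using scikit-learn's SVC (synthetic data, purely illustrative; note that scikit-learn's C is roughly the inverse of ISL's budget, so a large C tolerates fewer violations):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping blobs, so no perfect separating hyperplane exists
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # A small C allows many margin violations (wide, soft margin);
    # a large C tolerates few violations (narrow, harder margin)
    print(f"C={C:>6}: support vectors = {clf.n_support_.sum()}")
```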
Support Vector Classifier

[Figures: decision boundaries with some mis-classified observations]
What if the data cannot be separated by linear boundaries?
Support Vector Machine and
Kernel
Kernel

Just imagine that using a kernel is similar to adding new variables.

[Figure: a mis-classified boundary]
Kernel Function

[Image: Aleksei Tiulpin]
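The kernel formulas themselves were lost in extraction; commonly listed choices, in standard notation (not taken verbatim from the slide), are:

```latex
\text{Linear:} \quad K(x_i, x_{i'}) = \sum_{j=1}^{p} x_{ij} x_{i'j}
\text{Polynomial:} \quad K(x_i, x_{i'}) = \Bigl( 1 + \sum_{j=1}^{p} x_{ij} x_{i'j} \Bigr)^{d}
\text{Radial (RBF):} \quad K(x_i, x_{i'}) = \exp\Bigl( -\gamma \sum_{j=1}^{p} (x_{ij} - x_{i'j})^2 \Bigr)
```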
Kernel

[Figures: non-linear decision boundaries obtained with different kernels. Images: Shuangyin Liu, stackoverflow.com, Mahmoud Elmezain, Xiaochuan Li]
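A minimal sketch of a kernelized SVM on data that no linear boundary can separate (scikit-learn; the dataset and parameters are illustrative assumptions):

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric circles: no linear boundary can separate the two classes
X, y = make_circles(n_samples=300, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear = SVC(kernel="linear").fit(X_train, y_train)
rbf = SVC(kernel="rbf", gamma=2.0).fit(X_train, y_train)

# The RBF kernel implicitly maps the data to a higher-dimensional space
# where a linear separator exists, so it should score much higher here
print("linear kernel accuracy:", linear.score(X_test, y_test))
print("RBF kernel accuracy:   ", rbf.score(X_test, y_test))
```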
Thank You. Questions?
Tubagus Dhafin Rukmanda
PACMANN AI Researcher
Email: [email protected]
+62 89620615729
