Big Data and Machine Learning

Lecture Slides 2: Linear Regression

University of Queensland
Outline

I Linear regression.
I Accuracy of linear regression.
I Problems with the linear regression model.
I Nearest-neighbor regression.
I First comparison of parametric (linear regression) vs. nonparametric (nearest neighbor) learning.
Supervised learning setup
I Given a random variable X , predict another variable Y .
I Example:
I Y = Sales.
I X = Advertising.
I Solution:
I Learn a function fˆ from the data.
I Given input X , output

Y = fˆ(X )

I Simplest candidate function: linear function.

f (X ) = β0 + β1 X

I β0 and β1 are called parameters.


I They are unknown constants to be learned.
I This is called a parametric learning problem: learning f reduces to learning a finite number of parameters.
Simple linear regression

I Assume that the relationship between Y and X is given


approximately by

Y ≈ f (X ) = β0 + β1 X

I More precisely, the deviation of Y from f (X) is assumed to be modeled by a random error (or noise) ε

Y = f (X ) + ε
= β0 + β1 X + ε
How to learn the linear regression model?

I The learning of f in general is hard:

I For each given x, you should be able to output f (x): an infinity of values to learn!
I But if f is a line, you only need two points to pin it down!
I The learning of f (x) = β0 + β1 x only requires learning β0 and β1.
I Objective:
I Given a sample y1 , y2 , ..., yn and x1 , ..., xn ,
I find two values βˆ0 and βˆ1 ,
I that best fit the sample.
How to find the best fit?
I Look at a possible line and find the residuals

ei = yi − βˆ0 − βˆ1 xi

I Magnitude of the residuals:

  $\sum_{i=1}^{n} e_i^2$

I This is called the sum of squared residuals.


Least Squares Minimization

I Objective function

  $Q(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = \sum_{i=1}^{n} e_i^2$

I Find the two numbers β0 and β1 that make Q as small as possible.
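A minimal sketch, assuming Python with NumPy and synthetic data (neither is part of the slides), of minimizing Q via the standard closed-form formulas for β̂0 and β̂1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample with true beta0 = 2 and beta1 = 3.
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=100)

# Minimizers of Q(beta0, beta1) = sum_i (y_i - beta0 - beta1 x_i)^2:
#   beta1_hat = sum((x_i - xbar)(y_i - ybar)) / sum((x_i - xbar)^2)
#   beta0_hat = ybar - beta1_hat * xbar
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

residuals = y - beta0_hat - beta1_hat * x
print(beta0_hat, beta1_hat, np.sum(residuals ** 2))  # estimates and sum of squared residuals
```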
Linear regression model in matrix form 1

I Matrices of output data, input data and residuals.

  $y = \begin{pmatrix} y_1 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_i \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_n \end{pmatrix}$

I Vector of coefficients.

  $\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}$
Linear regression model in matrix form 2

I Original equations (i is the observation index)

  y1 = β0 + β1 x1 + ε1
  ⋮
  yi = β0 + β1 xi + εi
  ⋮
  yn = β0 + β1 xn + εn

I Equations in matrix form

y = Xβ + ε
Least Squares Minimization 2

I The solution is very simple when written in matrix form.

  β̂ = (β̂0, β̂1)ᵀ = (Xᵀ X)⁻¹ Xᵀ y

I As β̂ is random, its accuracy is given by its variance.

  Var(β̂) = σ² (Xᵀ X)⁻¹

I σ² is the variance of the error term ε.
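A minimal sketch, assuming Python with NumPy and synthetic data, of the matrix-form formulas β̂ = (XᵀX)⁻¹Xᵀy and Var(β̂) = σ²(XᵀX)⁻¹, with σ² estimated from the residuals:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0, 10, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), x])

# beta_hat = (X'X)^{-1} X'y; np.linalg.solve avoids forming the inverse explicitly.
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Estimate sigma^2 from the residuals (n - 2 degrees of freedom for two parameters),
# then Var(beta_hat) = sigma^2 (X'X)^{-1}.
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - 2)
var_beta_hat = sigma2_hat * np.linalg.inv(XtX)

print(beta_hat)                        # roughly [2, 3]
print(np.sqrt(np.diag(var_beta_hat)))  # standard errors of beta0_hat and beta1_hat
```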
Multiple Linear regression: 2 inputs
I More than one input

I Y = β0 + β1 X1 + β2 X2 + ε
Multiple Linear regression

I Using matrix notation, going to p inputs is straightforward.


 
  $X = \begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,p} \\ 1 & x_{2,1} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,p} \end{pmatrix}$

I All the formulas in matrix form are the same!

  y = Xβ + ε
  β̂ = (Xᵀ X)⁻¹ Xᵀ y
  Var(β̂) = σ² (Xᵀ X)⁻¹
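The sketch above extends directly to p inputs: only the design matrix gains columns. A minimal example, again assuming Python with NumPy and synthetic data:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 3

# p input columns and a response that is linear in them plus noise.
X_inputs = rng.normal(size=(n, p))
beta_true = np.array([1.0, 0.5, -2.0, 3.0])   # [beta0, beta1, ..., betap]
y = beta_true[0] + X_inputs @ beta_true[1:] + rng.normal(scale=0.5, size=n)

# Same formula as before; X is now n x (p + 1).
X = np.column_stack([np.ones(n), X_inputs])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to beta_true
```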
How good is the regression model?

I Does adding an input Xj make sense?
I How many inputs do we need?
I How good is a linear regression model?
I Is there a better model?
Prediction and fit

I From least squares, we get β̂.


I Given x0 , the prediction is

  fˆ(x0) = x0 β̂ = β̂0 + β̂1 x0,1 + · · · + β̂p x0,p

I We can now compute
I Mean squared error. (It is linked to R².)
I Test MSE. (Based on x0 in the test sample)
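A minimal sketch, assuming Python with NumPy and a synthetic train/test split, of computing the training MSE (and the R² it is linked to) and the test MSE:

```python
import numpy as np

rng = np.random.default_rng(3)

def make_data(n):
    # Synthetic data from y = 2 + 3x + noise, returned with an intercept column.
    x = rng.uniform(0, 10, size=n)
    return np.column_stack([np.ones(n), x]), 2.0 + 3.0 * x + rng.normal(scale=1.0, size=n)

X_train, y_train = make_data(150)   # training sample
X_test, y_test = make_data(50)      # held-out test sample (the x0's)

beta_hat = np.linalg.solve(X_train.T @ X_train, X_train.T @ y_train)

train_mse = np.mean((y_train - X_train @ beta_hat) ** 2)   # in-sample MSE
test_mse = np.mean((y_test - X_test @ beta_hat) ** 2)      # test MSE at the test x0's
r_squared = 1 - train_mse / np.var(y_train)                # R^2 is linked to the training MSE
print(train_mse, test_mse, r_squared)
```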
Special input data
I Example: a binary input:

  studenti = 1 if i is a student, and 0 otherwise

I Regression function:

  balancei = β0 + β1 incomei + β2 studenti + εi
           = (β0 + β2) + β1 incomei + εi   if i is a student
           = β0 + β1 incomei + εi          otherwise
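A minimal sketch, assuming Python with NumPy and synthetic balance/income/student data, showing that the student dummy simply shifts the intercept:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300

income = rng.uniform(20, 120, size=n)      # synthetic income values
student = rng.integers(0, 2, size=n)       # 1 if i is a student, 0 otherwise
balance = 200 + 6 * income + 380 * student + rng.normal(scale=50, size=n)

# balance_i = beta0 + beta1 income_i + beta2 student_i + eps_i
X = np.column_stack([np.ones(n), income, student])
beta0, beta1, beta2 = np.linalg.solve(X.T @ X, X.T @ balance)

print("intercept for non-students:", beta0)            # beta0
print("intercept for students:    ", beta0 + beta2)    # beta0 + beta2
print("common income slope:       ", beta1)            # beta1
```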
Discrete variable coding
I Example: a discrete input in the regression of Y on X

  X ∈ {red, blue, green}

I Binary variable coding:

I Z1 = 1 if X = red and 0 otherwise.
I Z2 = 1 if X = blue and 0 otherwise.
I Z3 = 1 if X = green and 0 otherwise.
I Dummy variable trap
I Regression Y = β0 + β1 Z1 + β2 Z2 + β3 Z3 + ε
I 4 unknowns with three equations: Identification problem.
I E [Y |X = red] = β0 + β1
I E [Y |X = blue] = β0 + β2
I E [Y |X = green] = β0 + β3
I Drop one of the three variables Z1 , Z2 and Z3 .
Discrete variable coding (continued)

I Drop Z1 and use red as a base category.


I Contrast function:
          red   blue   green
  red      1     0      0
  blue     0     1      0
  green    0     0      1
I 3 unknowns with three equations in regression
Y = β0 + β2 Z2 + β3 Z3 + ε.
I E [Y |X = red] = β0
I E [Y |X = blue] = β0 + β2
I E [Y |X = green] = β0 + β3
I Interpretation of coefficients: deviations from base category.
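A minimal sketch, assuming Python with NumPy and synthetic data with made-up group means, of treatment coding with red as the base category:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300

# A three-level input X in {red, blue, green}.
X_cat = rng.choice(["red", "blue", "green"], size=n)

# Drop Z1 (red is the base category); keep the Z2 = blue and Z3 = green indicators.
Z2 = (X_cat == "blue").astype(float)
Z3 = (X_cat == "green").astype(float)

# Synthetic group means: 10 (red), 13 (blue), 7 (green), plus noise.
means = {"red": 10.0, "blue": 13.0, "green": 7.0}
y = np.array([means[c] for c in X_cat]) + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), Z2, Z3])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
# beta0 ~ mean for red (the base category);
# beta2, beta3 ~ deviations of blue and green from the base category.
```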
Discrete variable coding (continued)
I Drop the variable representing X = green and use sum contrasts coding.
I Contrast function:
          red   blue
  red      1     0
  blue     0     1
  green   -1    -1
I 3 unknowns with three equations in regression
Y = β0 + β1 Z1 + β2 Z2 + ε.
I E [Y |X = red] = β0 + β1
I E [Y |X = blue] = β0 + β2
I E [Y |X = green] = β0 − β1 − β2
I Interpretation of coefficients: Average effect

  (E[Y|X = red] + E[Y|X = blue] + E[Y|X = green]) / 3 = 3β0 / 3 = β0
Polynomial regression
I Example from the Auto data: a degree-2 polynomial.

  mpgi = β0 + β1 horsepoweri + β2 horsepoweri² + εi
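A minimal sketch, assuming Python with NumPy and synthetic horsepower/mpg data (not the actual Auto data set), of fitting a degree-2 polynomial, which is still linear in the parameters:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200

# Synthetic stand-in for the Auto data: mpg falls with horsepower, with some curvature.
horsepower = rng.uniform(50, 230, size=n)
mpg = 56 - 0.47 * horsepower + 0.0012 * horsepower ** 2 + rng.normal(scale=3.0, size=n)

# The design matrix simply gains a squared column; the least squares formula is unchanged.
X = np.column_stack([np.ones(n), horsepower, horsepower ** 2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ mpg)
print(beta_hat)  # roughly [56, -0.47, 0.0012]
```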


List of potential problems

I The relationship between inputs and outputs is nonlinear.
I Correlation of errors. (Time series data, panel data.)
I The variance of the error term is not constant.
I Outliers and high-leverage points.
I Collinear inputs.
I Endogeneity → causal inference issues.
Causal Inference and other pitfalls

I Simpson’s Paradox

I Interpretability and policy recommendations.


Nearest neighbor regression
I Nearest neighbor regression

I N0 is the set of K nearest neighbors to x0

  $\hat f(x_0) = \frac{1}{K} \sum_{x_i \in N_0} y_i$
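A minimal sketch, assuming Python with NumPy and synthetic one-dimensional data, of the K-nearest-neighbor prediction above:

```python
import numpy as np

def knn_regress(x0, x_train, y_train, k):
    """Predict y at x0 as the average y of the K nearest training points (the set N0)."""
    dist = np.linalg.norm(x_train - x0, axis=1)   # Euclidean distances to x0
    neighbors = np.argsort(dist)[:k]              # indices of the K closest points
    return y_train[neighbors].mean()

rng = np.random.default_rng(8)
x_train = rng.uniform(0, 10, size=(200, 1))
y_train = np.sin(x_train[:, 0]) + rng.normal(scale=0.2, size=200)

x0 = np.array([3.0])
print(knn_regress(x0, x_train, y_train, k=5))  # compare with np.sin(3.0) ~ 0.141
```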
Comparison of parametric and nonparametric regression

I Curse of dimensionality
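A minimal sketch, assuming Python with NumPy, illustrating the curse of dimensionality: for a fixed sample size, even the nearest neighbor stops being near as the number of inputs grows, which hurts nonparametric methods like nearest-neighbor regression.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 1000  # fixed sample size

# Distance from the centre of the unit cube to its nearest sampled neighbor,
# for increasing input dimension p.
for p in (1, 2, 5, 10, 50):
    X = rng.uniform(0, 1, size=(n, p))
    nearest = np.min(np.linalg.norm(X - 0.5, axis=1))
    print(p, round(nearest, 3))   # grows quickly with p
```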
