Linear Regression
Regression
• In regression the output is continuous.
• Many models could be used – the simplest is linear regression.
• Fit the data with the best hyper-plane which "goes through" the points.
[Figure: data points in the plane; y, the dependent variable (output), is on the vertical axis]
Types of Regression Models
• Simple regression (1 feature): Linear or Non-Linear
• Multiple regression (2+ features): Linear or Non-Linear
Linear regression
• Given an input x, compute an output y.
• For example:
  - Predict height from age
  - Predict house price from house area
  - Predict distance from a wall from sensor readings
[Figure: scatter plot of example data, Y against X]
Simple Linear Regression Equation
E(y) = β0 + β1x

where β0 is the intercept and β1 the slope of the regression line.
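For example, with purely illustrative values β0 = 30 and β1 = 3 (not fitted from any data in these slides), a house of size x = 20 (hundreds of ft²) would have expected price E(y) = 30 + 3 · 20 = 90.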
Linear Regression Model
• The relationship between the variables is a linear function:
Y = β0 + β1x + ε
Sample: 15 houses from the region.

House Number   Y: Actual Selling Price   X: House Size (100s of ft²)
1              89.5                      20.0
2              79.9                      14.8
3              83.1                      20.5
4              56.9                      12.5
5              66.6                      18.0
6              82.5                      14.3
7              126.3                     27.5
8              79.3                      16.5
9              119.9                     24.3
10             87.6                      20.2
11             112.6                     22.0
12             120.8                     19.0
13             78.5                      12.3
14             74.3                      14.0
15             74.8                      16.7
Averages       88.84                     18.17
[Figure: scatter plot of house price vs. size for the sample above]
Linear Regression – Multiple Variables
Yi = β0 + β1X1 + β2X2 + … + βpXp + ε
Regression Model
• Our model assumes that
  E(Y | X = x) = β0 + β1x   (the "population line")
[Figure: sample points scattered around the population line]
• The objective function is the sum of squared errors (SSE):
  SSE = Σi=1..n [yi − (b0 + b1xi)]²
• To find the values of the coefficients which minimize the objective function, we
  take the partial derivatives of the objective function (SSE) with respect to the
  coefficients, set these to 0, and solve (see the sketch below):
  b1 = (n Σxy − Σx Σy) / (n Σx² − (Σx)²)
  b0 = (Σy − b1 Σx) / n
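A minimal Python sketch of these closed-form estimates, applied to the 15-house sample from the table above (the variable names are ours):

```python
# Closed-form least-squares fit for simple linear regression:
#   b1 = (n*Σxy − Σx*Σy) / (n*Σx² − (Σx)²),   b0 = (Σy − b1*Σx) / n
# Data: the 15-house sample from the table above.
X = [20.0, 14.8, 20.5, 12.5, 18.0, 14.3, 27.5, 16.5,
     24.3, 20.2, 22.0, 19.0, 12.3, 14.0, 16.7]    # house size (100s of ft²)
Y = [89.5, 79.9, 83.1, 56.9, 66.6, 82.5, 126.3, 79.3,
     119.9, 87.6, 112.6, 120.8, 78.5, 74.3, 74.8]  # actual selling price

n = len(X)
Sx, Sy = sum(X), sum(Y)
Sxy = sum(x * y for x, y in zip(X, Y))
Sxx = sum(x * x for x in X)

b1 = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)
b0 = (Sy - b1 * Sx) / n
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # fitted intercept and slope
```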
Multiple Linear Regression
Y = β0 + β1X1 + β2X2 + … + βnXn

h(x) = Σi=0..n βi xi   (with x0 = 1, so β0 is the intercept)

• There is a closed form, sketched below, which requires matrix inversion, etc.
• There are iterative techniques to find the weights, e.g. the delta rule
  (also called the LMS method), which updates the weights towards the
  objective of minimizing the SSE.
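A sketch of the closed-form (normal equation) solution with NumPy; the data below is made up purely to illustrate the mechanics, not taken from the slides:

```python
import numpy as np

# Normal-equation solution for multiple linear regression:
#   beta = (XᵀX)⁻¹ Xᵀ y
# Illustrative data: 5 samples, 2 features (made-up values).
X = np.array([[2.0, 1.0],
              [3.0, 2.0],
              [4.0, 1.5],
              [5.0, 3.0],
              [6.0, 2.5]])
y = np.array([7.1, 10.2, 11.4, 15.3, 16.2])

# Prepend a column of ones so beta[0] acts as the intercept β0.
Xb = np.hstack([np.ones((len(X), 1)), X])

# Solving the linear system is numerically safer than forming the inverse.
beta = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
print(beta)  # [b0, b1, b2]
```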
Linear Regression
• Linear regression is quite a simple statistical regression method used for
  predictive analysis; it models the relationship between continuous variables.
• Linear regression shows the linear relationship between the independent
  variable (X-axis) and the dependent variable (Y-axis), hence the name
  linear regression.
• If there is a single input variable (x), the model is called simple linear
  regression; if there is more than one input variable, it is called multiple
  linear regression.
• The linear regression model gives a sloped straight line describing the
  relationship between the variables.
• The graph presents the linear relationship between the dependent variable
  and the independent variable: when the value of x (the independent variable)
  increases, the value of y (the dependent variable) likewise increases. The
  red line is referred to as the best-fit straight line.
• To calculate the best-fit line, linear regression uses the traditional
  slope-intercept form, y = mx + b.
Data Model in Simple Linear Regression
Example
Practice Exercise 2
Answer
Random Error (Loss) Identification
Cost function
• The cost function helps to figure out the best possible values for a0
  and a1, which provide the best-fit line for the data points.
• In linear regression, the Mean Squared Error (MSE) cost function is
  used, which is the average of the squared errors between the
  predicted values and the actual values.
• For the simple linear equation y = mx + b, the MSE (sketched below) is
  MSE = (1/n) Σi=1..n (yi − (m·xi + b))²
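A minimal sketch of the MSE cost for a candidate line y = mx + b; the data values here are made up for illustration:

```python
def mse(m, b, xs, ys):
    """Mean squared error of the candidate line y = m*x + b over the data."""
    n = len(xs)
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / n

# Toy data (made-up values): compare two candidate lines.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]
print(mse(2.0, 0.0, xs, ys))  # near the data, low cost
print(mse(0.5, 1.0, xs, ys))  # poor fit, much higher cost
```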
Gradient descent
• Gradient descent is a method of updating a0 and a1 to minimize the
  cost function (MSE).
• A regression model uses gradient descent to update the coefficients of
  the line: it starts from a random selection of coefficient values and
  then iteratively updates the values until it reaches the minimum of the
  cost function, as sketched below.
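A minimal sketch of batch gradient descent on the MSE for the simple linear model; the learning rate, step count, and data are our own illustrative choices, not values from the slides:

```python
def fit_gd(xs, ys, lr=0.01, steps=5000):
    """Fit y = a0 + a1*x by batch gradient descent on the MSE cost."""
    a0, a1 = 0.0, 0.0                      # arbitrary starting coefficients
    n = len(xs)
    for _ in range(steps):
        # Errors of the current line at every data point.
        errs = [(a0 + a1 * x) - y for x, y in zip(xs, ys)]
        grad_a0 = 2.0 / n * sum(errs)                              # ∂MSE/∂a0
        grad_a1 = 2.0 / n * sum(e * x for e, x in zip(errs, xs))   # ∂MSE/∂a1
        a0 -= lr * grad_a0                 # step downhill on the cost surface
        a1 -= lr * grad_a1
    return a0, a1

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]
print(fit_gd(xs, ys))  # approaches the least-squares coefficients
```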
Multiple Linear Regression
Assignment 2
Implement Simple Linear Regression in Python (the dataset is uploaded in the Assignment link).