YLP Logistic Regression

The document provides an overview of logistic regression for binary classification problems. It discusses how logistic regression can be used to predict things like loan defaults or medical diagnoses. Unlike linear regression, logistic regression uses a sigmoid curve to model binary outcomes as probabilities between 0 and 1. It finds the best fitting curve by varying the beta coefficients to maximize the likelihood of classifying the training data correctly.

Y.

LAKSHMI PRASAD
08978784848
Objectives
• Introduction to Logistic Regression
• Some Potential Problems and Solutions
• Probability and Odds
• Assumptions of Logistic Regression
• Interpreting Coefficients in Logistic Regression
• Evaluating the performance of the model



Binary Classification
• The most common use of logistic regression models is in binary classification problems.
• Bank customers seek loans from the bank, promising to repay the loan in installments over a determined period of time and with some interest on the amount. However, banks are always at risk because many customers might not be able to pay their loans back. This can cause big losses to the bank.



Classification Problem
• Therefore, predicting credit risk is of utmost importance for the bank, which analyzes customers' information and credit history before deciding to grant a loan.
• Logistic Regression can be used to build a predictive model of how likely a customer is to default on the repayment of the loan.



What is Classification?
PARTITIONING the (FEATURE) SPACE into PURE REGIONS assigned to each CLASS

Two types of Classifiers

Descriptive (Generative) Classifiers: learn class DENSITY functions
• Bayesian Classifiers
• Nearest Neighbor

Discriminative Classifiers: learn class SEPARATORS
• Logistic Regression
• Decision Trees
• Neural Networks, SVM

Approaches to learn classifiers
• 1. Logistic Regression
• 2. K-Nearest Neighbour classifiers
• 3. Decision Trees
• 4. Bayesian Classifiers (Naive Bayes)
• 5. Support Vector Machines
• 6. Neural Networks
• 7. Bagging (Random Forest)
• 8. Boosting (AdaBoost, XGBoost)



Expectations from Model
• Predictive (classification) accuracy: the ability of the model to correctly predict the class label of new or previously unseen data.
• 1. Accuracy = percent (%) of testing-set examples correctly classified by the classifier.
• 2. Speed: the computation costs involved in generating and using the model should be as low as possible.
• 3. Robustness: the ability of the model to make correct predictions given noisy data or data with missing values.
• 4. Scalability: the ability to construct the model efficiently given large amounts of data.



Expectations from Model
• 5. Interpretability/Explainability: the level of understanding and insight that is provided by the model.
• 6. Generalized: it should be able to give a comparable level of accuracy on the validation set.
• 7. Deterministic: I expect my model to give me the same result every time I run it.
• 8. Regularization: I expect my model's complexity to be controlled by me, so that I can prevent my model from over-fitting.



Logistic Regression
• Logistic Regression is a supervised classification model.
• It allows you to make predictions from labelled data, if the target (output) variable is categorical.
• 1. A bank wants to predict, based on some variables, whether a particular customer will default on a loan or not.
• 2. A factory manager wants to predict, based on some variables, whether a particular machine will break down in the next month or not.
• 3. Google's backend wants to predict, based on some variables, whether an incoming email is spam or not.



Logistic Regression examples
• Marketing: classify whether a lead is a hot lead or a warm lead.
• Stock Market: predict whether a stock will outperform or underperform.
• Healthcare: whether a tumour is malignant or benign.
• Networking: classify whether a packet is malicious or not.
• Human Resources: whether an employee is going to leave the company (attrition) or not.



Logistic Regression
• Used because having a categorical outcome variable violates the assumption of linearity in normal regression.
• Instead of building a predictive model for "Y (Response)" directly, the approach models "Log Odds (Y)"; hence the name Logistic or Logit.



Where is the Problem?
• The dependent variable is limited to {0, 1} because we have 2 classes: default or no-default, diabetic or non-diabetic, churned or not churned, etc.
• Linear Regression is designed to minimize the Root Mean Squared Error, which is not an appropriate fit in this case.



Logistic Regression
• Let us take the diabetes example. In this example, we try to predict whether a person has diabetes or not, based on that person's blood sugar level.
• Why does a simple decision-boundary approach not work very well for this example?
• It would be too risky to decide the class purely on the basis of a cut-off because, especially in the middle, the patients could belong to either class, diabetic or non-diabetic.



Classifying with Linear Regression



Where is the Problem?
• Recall the graph of the diabetes example.
• Suppose there is another person, with a blood sugar level of 195, and you do not know whether that person has diabetes or not. What would you do then? Would you classify him/her as diabetic or non-diabetic?



Step Function



Limitation of Step Function
• Now, based on the boundary, you may be tempted to declare this person diabetic, but can you really do that?
• This person's sugar level (195 mg/dL) is very close to the threshold (200 mg/dL), below which people are declared non-diabetic.
• It is, therefore, quite possible that this person was just a non-diabetic with a slightly high blood sugar level.
• After all, the data does have people with slightly high sugar levels (220 mg/dL) who are not diabetic.



Classifying with Linear Regression



Classifying with Linear Regression
• The main problem with a straight line is that it is not steep enough.
• In the sigmoid curve, as you can see, you have low values for a lot of points, then the values rise all of a sudden, after which you have a lot of high values.
• In a straight line, though, the values rise from low to high very uniformly, and hence the "boundary" region, the one where the probabilities transition from low to high, is not present.



Logistic Regression
• In this situation, we would actually like to talk in terms of probability.
• One such curve which can model the probability of diabetes very well is the sigmoid curve.



Sigmoid Curve
• The sigmoid curve has all the properties you would want: extremely low values in the start, extremely high values in the end, and intermediate values in the middle. It is a good choice for modelling the value of the probability of diabetes.
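The shape described above can be sketched in a few lines of Python (a minimal illustration, not taken from the slides):

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Extremely low values at the start, intermediate in the middle,
# extremely high values at the end:
print(sigmoid(-6))  # ~0.0025
print(sigmoid(0))   # 0.5
print(sigmoid(6))   # ~0.9975
```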



S-Curve



S-Curve
• Here, we want P1, P2, P3, P4, P6 to be as small as possible and P5, P7, P8, P9, P10 to be as high as possible.
• In the case of P4, I can say either that I want to minimize P4, or that I want to maximize 1−P4.
• Combining all these points into the same formulation, I can say I want to maximize P5, P7, P8, P9, P10 and 1−P1, 1−P2, 1−P3, 1−P4, 1−P6.
• That means I want to maximize the product of all these terms. So, we want to find the β0 and β1 which maximize the product of all these terms.
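The "product of all these terms" is the likelihood. A minimal sketch, using hypothetical blood-sugar readings and labels (not the actual data behind the slides):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def likelihood(b0, b1, xs, ys):
    """Product of P_i for diabetic (y=1) points and (1 - P_i) for non-diabetic (y=0) points."""
    prod = 1.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        prod *= p if y == 1 else (1.0 - p)
    return prod

# hypothetical sugar levels and diabetes labels
xs = [150, 160, 170, 180, 210, 190, 220, 230, 240, 250]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# a well-chosen (b0, b1) gives a much higher likelihood than a flat p = 0.5 model
print(likelihood(-40, 0.2, xs, ys) > likelihood(0, 0.0, xs, ys))  # True
```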



Best-fit Curve
• The next step, just like in linear regression, is to find the best-fit curve.
• Hence, you learnt that in order to find the best-fit sigmoid curve, you need to vary β0 and β1 until you get the combination of beta values that maximises the likelihood.
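Varying β0 and β1 can be sketched as a crude grid search (real software uses iterative optimizers instead; the data and the grid of beta values here are hypothetical):

```python
import itertools
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neg_log_likelihood(b0, b1, xs, ys):
    """Negative log-likelihood; maximizing the likelihood = minimizing this."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(b0 + b1 * x), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

xs = [150, 160, 170, 180, 210, 190, 220, 230, 240, 250]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# try every (b0, b1) combination on a small grid, keep the best one
grid = itertools.product([-50, -40, -30, 0], [0.0, 0.1, 0.2, 0.3])
best = min(grid, key=lambda b: neg_log_likelihood(b[0], b[1], xs, ys))
print(best)  # (-40, 0.2)
```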



Understanding Likelihood
• Now, let's say that for the nine points in our example, the labels are as follows:

Point no.   1    2    3    4    5    6    7    8    9
Diabetes    no   no   no   yes  no   yes  yes  yes  yes

In this case, the likelihood would be equal to:

• A) (1−P1)(1−P2)(1−P3)(1−P4)(1−P5)(P6)(P7)(P8)(P9)
• B) (1−P1)(1−P2)(1−P3)(1−P5)(P4)(P6)(P7)(P8)(P9)
• C) (P1)(P2)(P3)(P4)(P5)(1−P6)(1−P7)(1−P8)(1−P9)
• D) (P1)(P2)(P3)(P5)(1−P4)(1−P6)(1−P7)(1−P8)(1−P9)



Answer
• Answer B: Recall that the likelihood is the product of (1−Pi) for all non-diabetic patients and (Pi) for all diabetic patients. Hence, the likelihood is given by (1−P1)(1−P2)(1−P3)(1−P5) (all non-diabetic patients) multiplied by (P4)(P6)(P7)(P8)(P9) (all diabetic patients).



Logistic Regression Best fit curve



Logistic Regression Best fit curve
• If you had to find the β0 and β1 for the best-fitting sigmoid curve, you would have to try a lot of combinations, until you arrive at the one which maximises the likelihood.
• This is similar to linear regression, where you vary β0 and β1 until you find the combination that minimises the cost function.
• Hence, this is called a Generalised Linear Model (GLM); the logistic regression model is one example.



ODDS
• The odds has a range of 0 to ∞, with values greater than 1 associated with an event being more likely to occur than not to occur, and values less than 1 associated with an event that is less likely to occur than not to occur.

odds = p / (1 − p)



Log(Odds)
• It solves the problem we encounter in fitting a linear model to probabilities.
• As probabilities (the dependent variable) only range from 0 to 1, we can get linear predictions that are outside of this range.

ln(odds) = ln(p / (1 − p)) = ln(p) − ln(1 − p)



What is an "Odds Ratio”?
• It is a standard statistical term that denotes the ratio of the probability of success to the probability of failure.
• If the probability of success is 0.75, then odds ratio = (0.75/0.25) = 3.
• In other words, there is a 3:1 chance of success.
• If the probability of success is 50%, what is the odds ratio?
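The arithmetic can be checked directly (it also answers the 50% question: the odds come out to 1):

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)

print(odds(0.75))  # 3.0 -> a 3:1 chance of success
print(odds(0.5))   # 1.0 -> even odds
```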



Why Use Odds-Ratio
• In logistic regression the Odds Ratio represents the
constant effect of a predictor X, on the likelihood that
one outcome will occur.
• In regression models, we often want a measure of the
unique effect of each X on Y.
• If we try to express the effect of X on the likelihood of a
categorical Y having a specific value through Probability,
the effect is not constant.



Why Use Odds-Ratio
• That means there is no way to express in one number
how X affects Y in terms of Probability.
• The effect of X on the probability of Y has different
values depending on the value of X.
• We will not be able to describe that effect in a single
number using Probability and will have to use Odds
Ratio



The Logistic Regression Model
The "logit" model:

ln[p/(1−p)] = β0 + β1X

• p is the probability that the event Y occurs, p(Y=1) [range: 0 to 1]
• p/(1−p) is the odds [range: 0 to ∞]
• ln[p/(1−p)] is the log odds, or "logit" [range: −∞ to +∞]
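Inverting the logit gives the predicted probability for a given X; a short sketch with hypothetical coefficients (b0 = -40 and b1 = 0.2 are made up for illustration):

```python
import math

def predicted_probability(b0, b1, x):
    """Invert ln(p/(1-p)) = b0 + b1*x to get p = 1 / (1 + e^-(b0 + b1*x))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

print(predicted_probability(-40, 0.2, 200))           # 0.5   (log odds = 0)
print(round(predicted_probability(-40, 0.2, 210), 3)) # 0.881 (log odds = 2)
```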



Logistic regression Model



Types of Logistic Regression
• 1. Binary Logit: used when the response variable is binary or dichotomous. It has only 2 outcomes, e.g. Good vs. Bad, Yes vs. No.
• 2. Multinomial Logit: used when the response variable has more than 2 outcomes and the outcomes cannot be ordered in any manner, e.g. choice of bread.
• 3. Ordered Logit: used when the response variable has more than 2 outcomes and the outcomes can be ordered in a meaningful way, e.g. High / Medium / Low, or Strongly Agree / Agree / Disagree / Strongly Disagree.



Response Variable coding
• Data preparation for Logistic Regression includes:
• The response variable (or target variable) will need to be converted to 1/0.
• Code "Sanctioned personal loan" as "1" and "Rejected personal loan" as "0".
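In plain Python, the 1/0 conversion is a one-liner (the label strings are the ones from the slide; the sample rows are hypothetical):

```python
# hypothetical raw target values from a loan dataset
raw = ["Sanctioned personal loan", "Rejected personal loan", "Sanctioned personal loan"]

# code "Sanctioned personal loan" as 1, "Rejected personal loan" as 0
target = [1 if label == "Sanctioned personal loan" else 0 for label in raw]
print(target)  # [1, 0, 1]
```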



Considerations
• 1. Missing value treatment: using logical rules.
• 2. Outlier detection: to ensure we don't have highly skewed values.
• 3. Multicollinearity: ensure that two independent variables do not provide similar information.
• 4. Variable transformations: ensure we have meaningful transformations of variables, depending on the research and modeling scope.
• 5. Descriptive statistics: basic measures of central tendency need to be output to validate that the correct data is being used for modeling.



Assumptions
• Makes fewer assumptions than Linear Regression.
• The logistic function applies a non-linear transformation.
• No assumption of a normal distribution for residuals.
• No homoscedasticity assumption.



Data Preparation Partitioning
• Divide the sample into 2 sub-samples:
• 1. Training Sample: the sample used to build the Logistic Regression model.
• 2. Validation Sample: estimates obtained from the development sample will be tested here, for comparison and for checking the robustness of the model.



Simple Decision Boundary?
Medium Decision Boundary!
Complex Decision Boundary!
Model SIGNAL, not NOISE

Model is too simple → UNDER-LEARN
Model is too complex → MEMORIZE
Model is just right → GENERALIZE

Generalization vs. Memorization
Generalization: the ability to predict or assign a label to a "new" observation based on the "model" built from past experience.
Generalize, don't Memorize!

[Figure: Model Accuracy vs. Model Complexity, contrasting Training Set Accuracy with Validation Set Accuracy and marking the right level of model complexity.]

Questions for Classification!
• What is the NATURE of the classifier's DECISION BOUNDARY?
• What is the COMPLEXITY of the classifier's DECISION BOUNDARY?
• How do I CONTROL the COMPLEXITY of the classifier?
• How do I know when the classifier is COMPLEX ENOUGH?
• How do I pick the right CLASSIFIER to use?
Metrics to Evaluate
• 1. Confusion Matrix (Accuracy, Sensitivity, Specificity)
• 2. Receiver Operating Characteristic (ROC) Curve
• 3. Weight of Evidence
• 4. Concordant, Discordant, Tied Pairs
• 5. Area Under Curve (c-Statistic)
• 6. Akaike Information Criterion
• 7. Gini Coefficient



Event Rate
• Event rate is a statistical term that describes how often an event occurs.
• Divide the number of times the event occurred by the total number of times it could have occurred to determine the event rate.
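As a formula, with hypothetical counts:

```python
def event_rate(events, opportunities):
    """How often an event occurs: occurrences / chances to occur."""
    return events / opportunities

# e.g. 120 loan defaults observed out of 2000 loans
print(event_rate(120, 2000))  # 0.06
```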



Confusion Matrix



Sensitivity & Specificity
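Since the matrix and the rate definitions appear as figures, here is a minimal sketch of the counts and the derived rates, using hypothetical actual/predicted labels (1 = event, 0 = non-event):

```python
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 1, 0, 0, 1, 0, 0, 1]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives

accuracy    = (tp + tn) / (tp + tn + fp + fn)  # % classified correctly
sensitivity = tp / (tp + fn)                   # true positive rate
specificity = tn / (tn + fp)                   # true negative rate
print(accuracy, sensitivity, specificity)  # 0.8 0.8 0.8
```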



Receiver Operating Characteristic (ROC) Curve

• Tradeoff between sensitivity and specificity.
• The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.
• The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.



Weight of Evidence (WoE)
• The weight of evidence tells the predictive power of an independent variable in relation to the dependent variable.
• The WoE recoding of predictors is particularly well suited for subsequent modeling using Logistic Regression.
• For a continuous variable, split the data into 10 parts (or fewer, depending on the distribution).
• Calculate WoE by taking the natural log of the division of the % of non-events by the % of events.
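A sketch of the WoE calculation for one binned variable, with hypothetical bin counts (it also accumulates the information value used for variable ranking):

```python
import math

# hypothetical (events, non_events) counts for 3 bins of one predictor
bins = [(10, 90), (30, 70), (60, 40)]
total_events = sum(e for e, ne in bins)
total_non_events = sum(ne for e, ne in bins)

iv = 0.0
for e, ne in bins:
    pct_events = e / total_events
    pct_non_events = ne / total_non_events
    woe = math.log(pct_non_events / pct_events)  # ln(% non-events / % events)
    iv += (pct_non_events - pct_events) * woe    # information-value contribution
    print(round(woe, 3))
print(round(iv, 3))
```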



Information Value
• Information value is a useful concept for variable selection during model building.
• It helps to rank variables on the basis of their importance.

Information Value -- Variable Predictive Power
< 0.02 -- Not useful
0.02 to 0.1 -- Weak predictive power
0.1 to 0.3 -- Medium predictive power
0.3 to 0.5 -- Strong predictive power
> 0.5 -- Suspicious predictive power
Concordant, Discordant
• Concordant: percentage of pairs where the observation with the event has a higher predicted probability than the observation without the event.
• Percent Concordant = (Number of concordant pairs) / (Total number of pairs)
• Discordant: percentage of pairs where the observation with the event has a lower predicted probability than the observation without the event.
• Percent Discordant = (Number of discordant pairs) / (Total number of pairs)



c-Statistic
• Tied: percentage of pairs where the observation with the event has the same predicted probability as the observation without the event.
• Percent Tied = (Number of tied pairs) / (Total number of pairs)
• c-statistic: also called area under the curve (AUC). It is calculated by adding the Percent Concordant and 0.5 times the Percent Tied.
• c-statistic = Percent Concordant + 0.5 × Percent Tied

Higher percentages of concordant pairs and lower percentages of discordant and tied pairs indicate a more desirable model.
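These pair counts can be computed directly; the predicted probabilities below are hypothetical:

```python
from itertools import product

event_probs     = [0.8, 0.6, 0.55]  # predicted p for observations with the event
non_event_probs = [0.3, 0.5, 0.6]   # predicted p for observations without the event

pairs = list(product(event_probs, non_event_probs))  # every event/non-event pair
concordant = sum(1 for e, n in pairs if e > n)
discordant = sum(1 for e, n in pairs if e < n)
tied       = sum(1 for e, n in pairs if e == n)

c_statistic = concordant / len(pairs) + 0.5 * tied / len(pairs)
print(concordant, discordant, tied)  # 7 1 1
print(round(c_statistic, 3))         # 0.833
```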



Akaike Information Criterion (AIC)
• The Akaike Information Criterion (AIC) provides a method for assessing the quality of a model through comparison of related models.
• It is based on the Deviance, but penalizes making the model more complicated.
• Much like adjusted R-squared, its intent is to prevent the inclusion of irrelevant predictors.
• If you have more than one similar candidate model, then you should select the one that has the smallest AIC.



Gini Coefficient
• The Gini coefficient can be derived straight away from the AUC-ROC number.
• Gini is the ratio of the area between the ROC curve and the diagonal line to the area of the triangle above the diagonal, which works out to Gini = 2 × AUC − 1.
• A Gini above 60% indicates a "good" model.
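Deriving Gini straight from the AUC number (the identity Gini = 2 × AUC − 1 follows from the area ratio described above):

```python
def gini_from_auc(auc):
    """Gini = (area between ROC curve and diagonal) / (area of upper triangle) = 2*AUC - 1."""
    return 2 * auc - 1

print(round(gini_from_auc(0.83), 2))  # 0.66 -> above 60%, a "good" model
print(gini_from_auc(0.5))             # 0.0  -> no better than random
```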



Imbalanced Datasets – SMOTE
Synthetic Minority Over-sampling Technique:
• Imbalanced data sets are a special case of the classification problem, where the class distribution is not uniform among the classes.
• These data sets pose a challenge because they create a bias towards the majority class.
• Oversampling involves using a bias to select more samples from one class than from another.
• The general idea of this method is to artificially generate new examples of the minority class using the nearest neighbors of existing cases.
• The majority class examples are also under-sampled, leading to a more balanced dataset.
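A toy sketch of the synthetic-generation step (real implementations, such as imbalanced-learn's SMOTE, apply the same idea to full feature vectors; the 2-D minority points here are made up):

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself)
        neighbours = sorted((m for m in minority if m is not x),
                            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # random point on the segment between x and nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
print(smote_sketch(minority, n_new=4))  # 4 new points near the minority cluster
```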
Questions?
