Parametric and Non-Parametric Statistical Testing
Work Plan
Parametric and non-parametric statistical testing
2016/2017
Pedro Campos / Paula Brito
Inference
Parameters (population): mean, proportion, variance…
Estimates (sample): mean, proportion, variance…
15/11/2016
Types of data
Qualitative variables: Nominal, Ordinal
Quantitative variables: Discrete, Continuous
Types of data
Variables are quantitative if they are associated with a numerical value.
Examples: number of children, age, weight, number of workers, share capital
Normal Distribution
central area   z
90%            1.645
95%            1.960
98%            2.326
99%            2.576
99.8%          3.090
99.9%          3.291
99.99%         3.891
99.999%        4.417
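The z values in a table like this can be reproduced from the standard normal quantile function. The following is a minimal sketch, assuming the tabulated areas are central (two-sided) areas, so that z is the quantile of order 0.5 + area/2; the helper name `z_for_central_area` is hypothetical.

```python
from statistics import NormalDist


def z_for_central_area(area: float) -> float:
    """Return z such that P(-z < Z < z) = area for a standard normal Z."""
    # A central area leaves (1 - area) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + area / 2)


for area in (0.90, 0.95, 0.99):
    print(f"{area:.0%} -> z = {z_for_central_area(area):.3f}")
```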
Confidence intervals
Hypotheses testing
Types of errors
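Decision taken | Real situation: H0 is true | Real situation: H0 is false
Do not reject H0 | Correct decision | Type II error
Reject H0 | Type I error | Correct decision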
Tests of Hypothesis
Decision rule:
p < α , reject H0
p ≥ α , do not reject H0
p: p-value (reported as Sig. in SPSS)
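The decision rule can be written as a one-line function. This is only an illustrative sketch: the function name `decide` and the default α = 0.05 are assumptions, not part of the slides.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 when p < alpha, otherwise do not reject."""
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.0071))  # reject H0
print(decide(0.05))    # do not reject H0 (p is not strictly below alpha)
```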
T-Student Test
• Tests hypotheses about the mean of a quantitative variable, in one or two groups.
• May be applied to:
  - One sample;
  - Two independent samples;
  - Two paired samples.
T-Student Test
One sample
Hypotheses:
H0: µ = µ0
H1: µ ≠ µ0
Test statistic (with µ = µ0 under H0):
$$T = \frac{\bar{X} - \mu}{S'/\sqrt{n}} \sim t_{n-1}$$
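The one-sample statistic can be computed directly from its definition. This is a minimal sketch, assuming S' denotes the sample standard deviation with an n − 1 denominator; the function name and the data are made up for illustration. The resulting value is then compared with the t distribution with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (T, degrees of freedom) for H0: mu = mu0."""
    n = len(sample)
    # stdev() uses the n - 1 denominator (assumed here to be S').
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1


# Hypothetical measurements, tested against mu0 = 5.
t_stat, df = one_sample_t([5.1, 4.9, 5.6, 5.8, 6.0, 5.3], mu0=5)
print(f"T = {t_stat:.3f} with {df} degrees of freedom")
```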
T-Student Test
Two independent samples
Hypotheses:
H0: µ1 = µ2
H1: µ1 ≠ µ2 (or µ1 < µ2 or µ1 > µ2)
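The slide does not give the test statistic for this case. As a hedged illustration only, the sketch below assumes the classical pooled-variance form (equal population variances), which follows a t distribution with n1 + n2 − 2 degrees of freedom under H0; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_two_sample_t(a, b):
    """Return (T, degrees of freedom) for H0: mu1 = mu2, assuming equal variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


t_stat, df = pooled_two_sample_t([10, 12, 14], [8, 9, 10])
print(f"T = {t_stat:.3f} with {df} degrees of freedom")
```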
T-Student Test
Two paired samples
Hypotheses:
H0: µD = 0
H1: µD ≠ 0 (or µD < 0 or µD > 0)
Test statistic:
$$T = \frac{\bar{D} - \mu_D}{S'_D/\sqrt{n}} \sim t_{n-1}$$
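For paired samples the statistic is computed on the differences D. This is a minimal sketch with µD = 0 under H0, assuming S'_D is the standard deviation of the differences with an n − 1 denominator; the function name and the before/after values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (T, degrees of freedom) for H0: mu_D = 0 on paired observations."""
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


before = [10, 12, 9, 11]
after = [12, 14, 10, 15]
t_stat, df = paired_t(before, after)
print(f"T = {t_stat:.3f} with {df} degrees of freedom")
```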
ANOVA
Comparison of means of a numerical variable in two or
more populations, from which random samples were
drawn.
ANOVA
Compares the variance within samples (groups), i.e. the residual variance, with the variance between samples (groups), i.e. the variance explained by the factor.
One-way ANOVA
The behaviour of the numerical variable under analysis is
supposedly influenced by just one factor, with k levels.
We have k samples, one for each level of the factor
(ex: one sample for each shop).
Factor levels:
  1           2           …   k
  x_{11}      x_{12}      …   x_{1k}
  x_{21}      x_{22}      …   x_{2k}
  …           …           …   …
  x_{n_1 1}   x_{n_2 2}   …   x_{n_k k}
n_j: sample size for level j; total sample size $n = \sum_{j=1}^{k} n_j$
TOTAL: $SQT = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x})^2$, with n − 1 degrees of freedom
Hypotheses:
H0: µ1 = µ2 = … = µk
H1: ∃ i, j : µi ≠ µj
$$F = \frac{QMF}{QME} \sim F(2, 9)$$
p-value = 0.0071 = 0.71%
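The F ratio can be computed from the between-group and within-group sums of squares. The sketch below is an illustration under assumptions: QMF is taken as the factor mean square, SQF/(k − 1), and QME as the residual mean square, SQE/(n − k); the names SQF and SQE are introduced by analogy with SQT, and the data are made up (they are not the data behind the F(2, 9) result).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, (df_factor, df_error)) for a one-way layout."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Between-group (factor) sum of squares.
    sqf = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (residual) sum of squares.
    sqe = sum((x - mean(g)) ** 2 for g in groups for x in g)
    qmf = sqf / (k - 1)
    qme = sqe / (n - k)
    return qmf / qme, (k - 1, n - k)


f_stat, dfs = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.2f} with {dfs} degrees of freedom")
```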
- Tukey’s test
- Fisher’s test (least significant difference)
  - Only for a few comparisons
- Scheffé’s test
- Bonferroni’s test
  - Significance level: $\alpha' = 1 - \sqrt[k]{1 - \alpha}$
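The adjusted significance level is a one-line computation. This is a minimal sketch, assuming k is the number of comparisons being made; the function name `adjusted_alpha` is hypothetical.

```python
def adjusted_alpha(alpha: float, k: int) -> float:
    """Per-comparison level alpha' = 1 - (1 - alpha)^(1/k)."""
    return 1 - (1 - alpha) ** (1 / k)


# Overall level 5% spread over 3 comparisons (illustrative values).
print(f"alpha' = {adjusted_alpha(0.05, 3):.5f}")
```

With a single comparison (k = 1) the level is unchanged, and it shrinks as k grows.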
Tests of Hypothesis
Comparison of a numerical variable in different populations from
each of which we have a random sample
Two groups:
  Independent samples:
    t-Student test for comparing means: Normal populations or large samples (application of the Central Limit Theorem).
    Alternative: non-parametric Mann-Whitney test.
  Paired samples:
    t-Student test: Normal populations or large samples (application of the Central Limit Theorem).
    Alternative: non-parametric Wilcoxon test.
K > 2 groups:
  Normal populations: analysis of variance (ANOVA).
  Alternative: non-parametric Kruskal-Wallis test.
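The choice of test summarised above can be expressed as a small lookup. This is an illustrative sketch only: the function name, its parameters and the returned labels are hypothetical, and `assumptions_met` stands for "Normal populations or, for two groups, large samples".

```python
def choose_test(n_groups: int, paired: bool = False, assumptions_met: bool = True) -> str:
    """Suggest a test for comparing a numerical variable across groups."""
    if n_groups == 2:
        if paired:
            return "paired t-Student test" if assumptions_met else "Wilcoxon test"
        return "independent-samples t-Student test" if assumptions_met else "Mann-Whitney test"
    if n_groups > 2 and not paired:
        return "ANOVA" if assumptions_met else "Kruskal-Wallis test"
    # Other designs are not covered by the summary above.
    raise ValueError("design not covered")


print(choose_test(2, paired=True, assumptions_met=False))  # Wilcoxon test
print(choose_test(4))                                      # ANOVA
```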
Non-parametric tests
• Generally, they are an alternative to parametric tests
when the assumptions (normality, sample size) are not
met.
Correlations test
Pearson test (assuming Normal distribution of the
variables)
Tests the linear correlation coefficient
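$$R = \frac{s_{XY}}{s_X \times s_Y} = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2}\,\sqrt{\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2}} = \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}\,\sqrt{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}}$$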
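The linear correlation coefficient can be computed from sums of products and squares. The sketch below uses the form with Σ XiYi − n·X̄·Ȳ in the numerator; the function name and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Linear correlation coefficient R from sums of products and squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
    denom_x = sum(a * a for a in x) - n * x_bar ** 2
    denom_y = sum(b * b for b in y) - n * y_bar ** 2
    return numerator / sqrt(denom_x * denom_y)


# y is an exact linear function of x, so R should be 1 up to rounding.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```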
Correlations test
Spearman - nonparametric test
(no assumptions are made about the distributions of the variables)
Tests the ordinal correlation coefficient
H0: variables are not correlated
H1: variables are correlated
Test statistic:
$$R_S = 1 - \frac{6\sum_{i=1}^{n} D_i^2}{n(n^2 - 1)}, \qquad D_i = R_{iX} - R_{iY}$$
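The Spearman coefficient is obtained by ranking each variable and applying the formula to the rank differences. This is a minimal sketch, assuming there are no tied values (ties would need average ranks, which this helper does not handle); the function names and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """R_S = 1 - 6 * sum(D_i^2) / (n * (n^2 - 1)), with D_i the rank differences."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


print(spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4]))
```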
Type | Test | What it tests | H0 | SPSS menu
Parametric | t-Student test | Whether two independent samples have equal means | Means in the two … | Analyze/Compare …
Non-parametric | Chi-square (only for qualitative variables) | Whether two qualitative variables are independent | Variables are independent | Analyze/Descriptive statistics/Crosstabs
Non-parametric | Kolmogorov-Smirnov | Whether a variable follows a particular distribution | The variable follows a particular distribution | Analyze/Desc.Stat/Explore or Analyze/Non parametric tests
Non-parametric | Mann-Whitney | Whether two independent samples have equal means/medians (assumptions not met) | Means/medians of the two independent populations are equal | Analyze/Non parametric tests/independent samples
Non-parametric | Wilcoxon | Whether two related samples have equal means/medians (assumptions not met) | Means/medians of the related populations are equal | Analyze/Non parametric tests/related samples
Non-parametric | Kruskal-Wallis | Whether three or more independent samples have equal means (assumptions not met) | Means of the three or more independent populations are equal | Analyze/Non parametric tests/independent samples
Practice
Using the file 1991 US General Social Survey, answer
the following questions, stating the hypotheses
and making the appropriate assumptions:
1. Are there significant differences in education
level by sex?
2. Are there significant differences between the
education levels of the parents?
3. Are the variables happy and race independent?
4. Are there significant differences in the age of
respondents by region?