
Parametric and Non-Parametric Statistical Testing

This document discusses various statistical tests and concepts, including: - Parametric tests like the t-test and ANOVA assume the data comes from a normal distribution, while non-parametric tests do not have this assumption. - The t-test can be used to test differences between one sample or between two independent or paired samples. - ANOVA allows comparison of means across more than two groups and tests if the group means are all equal. It compares within-group and between-group variance. - Type I and II errors are discussed in the context of hypothesis testing. The significance level trades off the two error probabilities.

Uploaded by

Denise Maciel

15/11/2016

Work Plan
Parametric and non-parametric statistical testing

2016/2017
Pedro Campos / Paula Brito

Inference

Parameters (population): mean, proportion, variance…
Estimates (sample): mean, proportion, variance…

Source: HowMed http://howmed.net/community-medicine/tests-of-statistical-significance/



Types of data

Variables
– Qualitative: Nominal, Ordinal
– Quantitative: Discrete, Continuous

Types of data
Quantitative: if they are associated with a numerical value
Example: number of children, age, weight, number of workers, share capital

Qualitative: if they are NOT associated with a numerical value
The different modalities (or categories) may be represented by a code.
Example: gender, profession, marital status, nationality, age class

Qualitative Ordinal: if the modalities (categories) have an intrinsic order
Example: education level, age class, specialization level of a worker

Qualitative Nominal: if the modalities do NOT have an intrinsic order
Example: gender, profession, marital status, nationality


What is a statistical distribution?

• Some models, by their nature, represent typical random phenomena.
• Some examples of typical distributions:
– Normal
– Exponential
– Uniform
– Poisson
– Bernoulli

Normal Distribution
Central area P(−z < Z < z) and the corresponding z:
area      z
90%       1.645
95%       1.960
98%       2.326
99%       2.576
99.8%     3.090
99.9%     3.291
99.99%    3.891
99.999%   4.417

Continuous distribution associated with many different random phenomena.

It is the most important distribution because of its regularity: values concentrate symmetrically around the central value.
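
The z values tabulated for the Normal distribution can be checked numerically. The sketch below assumes that "area" means the central area P(−z < Z < z) of a standard Normal variable; the function name `z_for_central_area` is illustrative and not part of the slides.

```python
from statistics import NormalDist


def z_for_central_area(area: float) -> float:
    """Return z such that P(-z < Z < z) = area for a standard Normal Z."""
    # The central area leaves (1 - area) / 2 in each tail,
    # so z is the quantile of order (1 + area) / 2.
    return NormalDist().inv_cdf((1 + area) / 2)


for area in (0.90, 0.95, 0.99):
    print(f"{area:.0%}  z = {z_for_central_area(area):.3f}")
```

Under this assumption, 95% gives z close to 1.960, the value used for the usual 5% two-sided tests.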


Confidence intervals

Hypotheses testing

Two hypotheses: H0 and H1, of which only one is true

Test: procedure to decide between the two hypotheses, given the data from a random sample


Types of errors

Decision taken \ Real situation | H0 is true | H0 is false
Reject H0 | Type I error | Correct decision
Do not reject H0 | Correct decision | Type II error

α = P(Type I error) = P(Reject H0 | H0 true)
α is the significance level

β = P(Type II error) = P(Do not reject H0 | H1 true)
η = 1 − β is the power of the test

* Sig. (in SPSS)

Types of errors

• For a fixed sample size, reducing α leads to an increase of β.
• It is not possible to minimize both error probabilities at the same time.
• Trade-off solution: fix α (usual values: 5% or 1%), which corresponds to controlling the type I error first.

* Sig. (in SPSS)
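
The trade-off between α and β can be illustrated numerically. The sketch below is not from the slides: it assumes a right-tailed z-test with known standard deviation, and the values of `delta`, `sigma` and `n` are hypothetical choices made only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_right_tailed_z(alpha: float, delta: float, sigma: float, n: int) -> float:
    """Type II error of a right-tailed z-test of H0: mu = mu0 vs H1: mu = mu0 + delta."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)   # H0 is rejected when Z > z_crit
    shift = delta * sqrt(n) / sigma          # mean of Z when H1 is true
    return STD_NORMAL.cdf(z_crit - shift)    # P(do not reject H0 | H1 true)


# Hypothetical setting: true shift 0.5, sigma 1, sample size 20.
for alpha in (0.10, 0.05, 0.01):
    beta = beta_right_tailed_z(alpha, delta=0.5, sigma=1.0, n=20)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

With the sample size held fixed, each reduction of α in the loop produces a larger β, which is the behaviour described in the bullets above.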


Tests of Hypothesis

Decision rule:

p < α : reject H0
p ≥ α : do not reject H0

p : p-value *
* Sig. (in SPSS)

Steps in hypothesis testing

• Formulation of H0 and H1
• Choice of the decision variable (with an associated distribution): the test statistic
• Calculation of the observed value of the test statistic
• Determination of the p-value according to H1 (right, left, ...)
• Conclusion: rejection or not of H0
• Optionally, calculation of the power of the test, 1 − β
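
How the p-value depends on the form of H1, and how the decision rule is applied, can be sketched as follows. The example assumes a test statistic that follows a standard Normal distribution under H0; the function names and the labels "right", "left" and "two-sided" are illustrative choices, not SPSS terminology.

```python
from statistics import NormalDist


def p_value(z_obs: float, alternative: str) -> float:
    """p-value of an observed z statistic for the given form of H1."""
    cdf = NormalDist().cdf(z_obs)
    if alternative == "right":       # H1: parameter greater than the H0 value
        return 1 - cdf
    if alternative == "left":        # H1: parameter lower than the H0 value
        return cdf
    if alternative == "two-sided":   # H1: parameter different from the H0 value
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("alternative must be 'right', 'left' or 'two-sided'")


def decide(p: float, alpha: float = 0.05) -> str:
    """Decision rule: reject H0 when p < alpha."""
    return "reject H0" if p < alpha else "do not reject H0"


p = p_value(2.1, "two-sided")
print(f"p = {p:.4f} -> {decide(p)}")
```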


Parametric vs non-parametric statistical tests
• For practical purposes, you can think of "parametric" as referring to tests, such as t-tests and the analysis of variance, that assume the underlying source population(s) to have a known distribution, often the Normal distribution.
• They generally also assume that one's measures derive from an equal-interval scale.
• And you can think of "non-parametric" as referring to tests that do not rely on these particular assumptions.

Parametric vs non-parametric statistical tests
Examples of parametric tests include
• Student's t-test
• ANOVA
• Test for proportions

Examples of non-parametric tests include
• Chi-square tests
• Mann-Whitney test
• Wilcoxon signed-rank test
• Kruskal-Wallis test
• Friedman test


Student's t-test
• Allows testing hypotheses about the mean values of a quantitative variable, in one or two groups.
• For small samples (fewer than 30 observations), the variable is required to follow a Normal distribution in each group (verification: Kolmogorov-Smirnov (K-S) test).
• May be applied to:
– One sample;
– Two independent samples;
– Two paired samples.

Student's t-test
One sample

Hypotheses:
H0 : µ = µ0
H1 : µ ≠ µ0

In SPSS: Compare Means / One-Sample T Test

Test statistic:
$$T = \frac{\bar{X} - \mu}{S' / \sqrt{n}} \sim t_{(n-1)}$$
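
A minimal sketch of the one-sample statistic, assuming S' denotes the sample standard deviation computed with the n − 1 denominator and taking µ = µ0 under H0. The data and the value of `mu0` are hypothetical. The p-value would then be read from the t distribution with n − 1 degrees of freedom (SPSS reports it as Sig.).

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Observed T = (mean - mu0) / (S' / sqrt(n)) and its degrees of freedom."""
    n = len(sample)
    # statistics.stdev uses the n - 1 denominator, matching S' as assumed here.
    t_obs = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t_obs, n - 1


sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
t_obs, df = one_sample_t(sample, mu0=4)
print(f"T = {t_obs:.3f} with {df} degrees of freedom")
```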


Student's t-test
Two independent samples

Hypotheses:
H0 : µ1 = µ2
H1 : µ1 ≠ µ2 (or µ1 < µ2 or µ1 > µ2)

In SPSS: Compare Means / Independent-Samples T Test

Test statistic (equal variances assumed):
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{(n_1 + n_2 - 2)}$$
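
A minimal sketch of the two-sample statistic under H0 (µ1 − µ2 = 0). It assumes that Sp is the usual pooled standard deviation, Sp² = ((n1 − 1)S'1² + (n2 − 1)S'2²) / (n1 + n2 − 2), which the slide does not spell out. The data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Observed T for two independent samples, equal variances assumed."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances (n - 1 denominators).
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t_obs = (mean(sample1) - mean(sample2)) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t_obs, n1 + n2 - 2


group1 = [1, 2, 3, 4, 5]   # hypothetical observations
group2 = [3, 4, 5, 6, 7]
t_obs, df = pooled_t(group1, group2)
print(f"T = {t_obs:.3f} with {df} degrees of freedom")
```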

Student's t-test
Two paired samples

Hypotheses:
H0 : µD = 0
H1 : µD ≠ 0 (or µD < 0 or µD > 0)

In SPSS: Compare Means / Paired-Samples T Test

Test statistic:
$$T = \frac{\bar{D} - \mu_D}{S'_D / \sqrt{n}} \sim t_{(n-1)}$$
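
For paired samples the test reduces to a one-sample test on the differences D. A minimal sketch under H0 (µD = 0), with hypothetical before/after measurements and S'D taken as the standard deviation of the differences with the n − 1 denominator:

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Observed T for paired samples, computed on the differences second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t_obs = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_obs, n - 1


before = [10, 12, 9, 11]   # hypothetical paired observations
after = [12, 14, 10, 14]
t_obs, df = paired_t(before, after)
print(f"T = {t_obs:.3f} with {df} degrees of freedom")
```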


What about more than two samples?

→ For 2 samples: Student's t-test

For more than 2 samples:
Testing the k samples 2 by 2: bad option!

k = number of populations to compare

With m tests, each at level α and treated as independent, the probability of at least one Type I error (rejecting a null hypothesis when it is true) = 1 − (1 − α)^m
If m = 4, for α = 5% → error = 18.5% !
(Comparing k = 4 populations 2 by 2 already requires 6 tests, giving about 26.5%.)
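
The inflation of the Type I error can be reproduced with a few lines. The sketch assumes the tests are independent, which is only an approximation for pairwise comparisons that share samples; the function name is illustrative.

```python
from math import comb


def familywise_error(alpha: float, n_tests: int) -> float:
    """P(at least one Type I error) over n_tests independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


print(f"4 tests at 5%: {familywise_error(0.05, 4):.1%}")

# Comparing k populations 2 by 2 requires k(k - 1)/2 tests.
k = 4
n_pairs = comb(k, 2)
print(f"{k} populations -> {n_pairs} pairwise tests: {familywise_error(0.05, n_pairs):.1%}")
```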


ANOVA
Comparison of the means of a numerical variable in two or more populations, from which random samples were drawn.

Example: compare the mean value of the sales of a given product in different shops.

Assumptions: the numerical variable under analysis follows a Normal distribution, and the variances in the populations to compare are equal.


ANOVA
Compares the variance within samples (groups), the residual variance, with the variance between samples (groups), the variance explained by the factor.

If the residual variance is very low compared to the explained variance (due to the factor), we may conclude that the mean values of the variable under study differ between groups.


One-way ANOVA
The behaviour of the numerical variable under analysis is assumed to be influenced by just one factor, with k levels.
We have k samples, one for each level of the factor (e.g. one sample for each shop).

Factor levels
1           2           …   k
x_{11}      x_{12}      …   x_{1k}
x_{21}      x_{22}      …   x_{2k}
…           …           …   …
x_{n_1 1}   x_{n_2 2}   …   x_{n_k k}

n_j : sample size for level j
$$n = \sum_{j=1}^{k} n_j$$

One-way ANOVA

Source of variation | Sum of squares | Degrees of freedom | Mean squares | F
FACTOR | $SQF = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2$ | k − 1 | $QMF = \frac{SQF}{k-1}$ | $F = \frac{QMF}{QME}$
RESIDUAL | $SQE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$ | n − k | $QME = \frac{SQE}{n-k}$ |
TOTAL | $SQT = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x})^2$ | n − 1 | |


One-way ANOVA

So, when we have more than two independent samples:

Hypotheses:
H0 : µ1 = µ2 = … = µk
H1 : ∃ i, j : µi ≠ µj

In SPSS: Compare Means / One-Way ANOVA

Test statistic:
$$F = \frac{SQF / (k-1)}{SQE / (n-k)} \sim F_{(k-1,\, n-k)}$$


One-way ANOVA - Example

A product is sold in 3 different shops, A, B and C.
A sample of weekly sales values was observed in each shop:
A   B   C
75  74  60
70  78  64
66  72  65
69  68  55

Assuming the usual assumptions of analysis of variance, test the hypothesis that the shop has no effect on the sales.


One-way ANOVA - Example

Source of variation | Sum of squares | DF | Mean squares | F
FACTOR | SQF = 312 | k − 1 = 2 | QMF = 312 / 2 | Fobs = (312 / 2) / (156 / 9) = 9
RESIDUAL | SQE = 156 | n − k = 9 | QME = 156 / 9 |
TOTAL | SQT = 468 | n − 1 = 11 | |

$F = \frac{QMF}{QME} \sim F_{(2,9)}$, p-value = 0.0071 = 0.71%
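
The figures of the example can be recomputed from the shop data. This is a sketch using only the Python standard library; the function name is illustrative. The p-value line relies on a closed form, P(F > f) = (1 + 2f/ν)^(−ν/2), which holds only when the numerator has 2 degrees of freedom, as is the case here with k = 3 shops. For other designs a statistical package is needed.

```python
from statistics import mean


def one_way_anova(groups):
    """Return SQF, SQE, the observed F and its degrees of freedom."""
    all_values = [x for group in groups for x in group]
    n, k = len(all_values), len(groups)
    grand_mean = mean(all_values)
    # Between-group (factor) and within-group (residual) sums of squares.
    sqf = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sqe = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f_obs = (sqf / (k - 1)) / (sqe / (n - k))
    return sqf, sqe, f_obs, k - 1, n - k


shops = {
    "A": [75, 70, 66, 69],
    "B": [74, 78, 72, 68],
    "C": [60, 64, 65, 55],
}
sqf, sqe, f_obs, df1, df2 = one_way_anova(list(shops.values()))

# Closed form valid only for df1 == 2 (three groups).
p_val = (1 + 2 * f_obs / df2) ** (-df2 / 2)

print(f"SQF = {sqf}, SQE = {sqe}, F = {f_obs:.2f}, df = ({df1}, {df2})")
print(f"p-value = {p_val:.4f}")
```

Since the p-value is below 5%, the hypothesis that the shop has no effect on the sales is rejected at that level.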

One-way ANOVA

Multiple comparison of mean values
If ANOVA rejects the null hypothesis (that the mean values are equal across populations), we wish to know which pairs of means are different.
→ Post hoc tests for multiple comparison of mean values

- Tukey’s test
- Fisher’s test (Least Significant Difference)
  - Only for few comparisons
- Scheffé’s test
- Bonferroni’s test
  - Significance level: $\alpha' = 1 - \sqrt[k]{1 - \alpha}$
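
A short numerical check of the adjusted significance level, reading the formula as α' = 1 − (1 − α)^(1/k) and taking k as the number of comparisons (an assumption; the slide does not define k at this point). This form is often called the Šidák correction, and the simpler bound α/k is the one usually labelled Bonferroni; the two are very close in practice.

```python
def adjusted_alpha(alpha: float, k: int) -> float:
    """Per-comparison level such that k independent tests give overall level alpha."""
    return 1 - (1 - alpha) ** (1 / k)


alpha, k = 0.05, 6   # e.g. the 6 pairwise comparisons among 4 groups
a_adj = adjusted_alpha(alpha, k)
print(f"adjusted level = {a_adj:.5f}   alpha / k = {alpha / k:.5f}")

# Applying the adjusted level to all k tests recovers the overall level alpha.
overall = 1 - (1 - a_adj) ** k
print(f"overall level = {overall:.4f}")
```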


Tests of Hypothesis
Comparison of a numerical variable in different populations, from each of which we have a random sample
Two groups:
Independent samples:
Test for comparing means
Student's t-test: Normal populations or large samples (application of the Central Limit Theorem).
Alternative: non-parametric Mann-Whitney test
Paired samples:
Student's t-test: Normal populations or large samples (application of the Central Limit Theorem).
Alternative: non-parametric Wilcoxon test


Tests of Hypothesis
Comparison of a numerical variable in different populations, from each of which we have a random sample

k > 2 groups:
Normal populations: analysis of variance (ANOVA)
Alternative: non-parametric Kruskal-Wallis test

Non-parametric tests
• Generally, they are an alternative to parametric tests when the assumptions (normality and sample size) do not hold.
• Parametric tests are usually more powerful than non-parametric tests. However, non-parametric tests are a fair alternative for small samples.


Non-parametric tests

Hypothesis tests for variable association (involving pairs of variables, related samples)

Quantitative variables (numerical): correlation tests

Qualitative variables (categorical): Chi-square independence tests


Correlation tests
Pearson test (assuming Normal distribution of the variables)
Tests the linear correlation coefficient

Spearman test (no assumptions regarding the distributions of the variables)
Tests the ordinal (rank) correlation coefficient

Correlation tests
Pearson test: when variables are Normal
Tests the linear correlation coefficient

$$R = \frac{s_{XY}}{s_X \times s_Y} = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\sqrt{\left(\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2\right)\left(\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right)}} = \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)\left(\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2\right)}}$$

H0 : variables are not linearly correlated
H1 : variables are linearly correlated

Test statistic:
$$T = R \sqrt{\frac{n-2}{1-R^2}} \sim t_{(n-2)}$$
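
A minimal sketch of the Pearson coefficient, using the last form of the formula above, together with the associated T statistic. The data are hypothetical and the function names are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Linear correlation coefficient from sums of products and squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum(a * b for a, b in zip(x, y)) - n * mean_x * mean_y
    s_xx = sum(a * a for a in x) - n * mean_x ** 2
    s_yy = sum(b * b for b in y) - n * mean_y ** 2
    return s_xy / sqrt(s_xx * s_yy)


def correlation_t(r, n):
    """T = R * sqrt((n - 2) / (1 - R^2)), compared with t(n - 2) under H0."""
    return r * sqrt((n - 2) / (1 - r * r))


x = [1, 2, 3, 4, 5]   # hypothetical paired observations
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
t_obs = correlation_t(r, len(x))
print(f"R = {r:.4f}, T = {t_obs:.4f} with {len(x) - 2} degrees of freedom")
```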


Correlation tests
Spearman: non-parametric test (no assumptions regarding the distributions of the variables)
Tests the ordinal (rank) correlation coefficient

H0 : variables are not correlated
H1 : variables are correlated

Test statistic:
$$R_S = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{n(n^2 - 1)}, \qquad D_i = R_{iX} - R_{iY}$$
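
A minimal sketch of the Spearman coefficient computed from rank differences. It assumes there are no tied values, since the formula with 6 ∑ Di² is exact only in that case. The data are hypothetical.

```python
def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """R_S = 1 - 6 * sum(D_i^2) / (n * (n^2 - 1)), with D_i the rank differences."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


x = [10, 20, 30, 40, 50]   # hypothetical paired observations
y = [1, 3, 2, 5, 4]
rs = spearman_rs(x, y)
print(f"R_S = {rs:.2f}")
```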

Chi-square independence test

Analysis of contingency tables

H0 : variables are independent
H1 : variables are not independent

        B1   B2   ...  Bs   totals
A1      n11  n12  ...  n1s  n1.
A2      n21  n22  ...  n2s  n2.
...     ...  ...  ...  ...  ...
Ar      nr1  nr2  ...  nrs  nr.
totals  n.1  n.2  ...  n.s  n

Compares the observed frequencies nij with the expected ones under the independence hypothesis:
$$e_{ij} = \frac{n_{i.} \, n_{.j}}{n}$$

Test statistic:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(n_{ij} - e_{ij})^2}{e_{ij}} \sim \chi^2_{((r-1)(s-1))}$$

Provided that no eij < 1 and no more than 20% of the eij are < 5
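
A minimal sketch of the independence test on a contingency table. It assumes the standard definitions, expected frequency eij = (row total × column total) / n and statistic ∑ (nij − eij)² / eij with (r − 1)(s − 1) degrees of freedom, and it also checks the applicability condition on the eij. The 2 × 2 table is hypothetical.

```python
def chi_square_independence(table):
    """Return the chi-square statistic, its degrees of freedom and the expected counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequencies under independence.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


def expected_counts_ok(expected):
    """No expected count below 1 and at most 20% of them below 5."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_5 <= 0.20


observed = [[10, 20], [30, 40]]   # hypothetical contingency table
chi2, df, expected = chi_square_independence(observed)
print(f"chi-square = {chi2:.4f}, df = {df}, conditions met: {expected_counts_ok(expected)}")
```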


Common Statistical Tests

Test name | Objective | Null hypothesis (H0) | Where in SPSS?

Parametric tests
Student's t-test (one sample) | Tests whether the mean is greater than, lower than or equal to a value c | Mean is equal to c | Analyze / Compare Means / One-Sample T Test
Student's t-test (independent samples) | Tests whether two independent samples have equal means | Means in the two independent populations are equal | Analyze / Compare Means / Independent-Samples T Test
Student's t-test (paired samples) | Tests whether two related samples have equal means | Mean of the differences in the two populations is zero | Analyze / Compare Means / Paired-Samples T Test
ANOVA | Tests whether three or more independent samples have equal means | Means of the three or more independent populations are equal | Analyze / Compare Means / ANOVA

Non-parametric tests
Chi-square (only for qualitative variables) | Tests whether two qualitative variables are independent | Variables are independent | Analyze / Descriptive Statistics / Crosstabs
Kolmogorov-Smirnov | Tests whether a variable follows a particular distribution | The variable follows a particular distribution | Analyze / Descriptive Statistics / Explore or Analyze / Nonparametric Tests
Mann-Whitney | Tests whether two independent samples have equal means/medians (assumptions not met) | Means/medians of the two independent populations are equal | Analyze / Nonparametric Tests / Independent Samples
Wilcoxon | Tests whether two related samples have equal means/medians (assumptions not met) | Means/medians of the related populations are equal | Analyze / Nonparametric Tests / Related Samples
Kruskal-Wallis | Tests whether three or more independent samples have equal means (assumptions not met) | Means of the three or more independent populations are equal | Analyze / Nonparametric Tests / Independent Samples

Practice
Using the file 1991 US General Social Survey, answer the following questions, defining the hypotheses and checking the appropriate assumptions:
1. Are there any significant differences in education level by sex?
2. Are there any significant differences between the education levels of the parents?
3. Are happy and race independent?
4. Are there any significant differences in the age of respondents by region?
