QM 7 Panel Regression Fixed Effects
Examples of panel data:
• Household expenditure surveys
• DK field experiments (within and between sessions)
• Registry data at Statistics Denmark
Notation for panel data
A double subscript distinguishes entities (e.g., subjects) and time
periods
i = entity, n = number of entities,
so i = 1,…,n
t = time period, T = number of time periods
so t = 1,…,T
Data: Suppose we have 1 regressor. The data are
$(X_{i,t};\ Y_{i,t})$, $i = 1,\dots,n$, $t = 1,\dots,T$
Panel data notation, ctd.
Panel data with k regressors:
$(X_{1,i,t};\ X_{2,i,t};\ \dots;\ X_{k,i,t};\ Y_{i,t})$, $i = 1,\dots,n$, $t = 1,\dots,T$
n = number of entities (e.g., subjects in an experiment)
T = number of time periods (e.g., tasks within an experiment or
across experiments)
Some jargon…
• Another term for panel data is longitudinal data
• Balanced panel: no missing observations; all variables are
observed for all entities (states) and all time periods (years)
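The definition of a balanced panel can be checked mechanically once the data are stored in long format, with one row per entity and time period. The following is a minimal Python sketch, not Stata code; the rows, the variable names, and the helper `is_balanced` are made up for illustration.

```python
# Toy long-format panel: one row per (entity, time period) pair.
# Entity names, years, and values are made up for illustration.
panel = [
    {"state": "CA", "year": 1982, "beertax": 0.5, "fatality": 2.0},
    {"state": "CA", "year": 1983, "beertax": 0.6, "fatality": 1.9},
    {"state": "TX", "year": 1982, "beertax": 1.0, "fatality": 2.5},
    {"state": "TX", "year": 1983, "beertax": 1.1, "fatality": 2.4},
]


def is_balanced(rows, entity_key, time_key):
    """True if every entity appears exactly once in every period, with no missing values."""
    entities = {row[entity_key] for row in rows}
    periods = {row[time_key] for row in rows}
    cells = {(row[entity_key], row[time_key]) for row in rows}
    no_duplicates = len(cells) == len(rows)
    all_cells_present = len(cells) == len(entities) * len(periods)
    no_missing_values = all(v is not None for row in rows for v in row.values())
    return no_duplicates and all_cells_present and no_missing_values


print(is_balanced(panel, "state", "year"))       # full 2 x 2 panel: True
print(is_balanced(panel[:-1], "state", "year"))  # TX 1983 dropped: False
```

With n = 2 entities and T = 2 periods, a balanced panel has n × T = 4 rows; dropping any one row makes it unbalanced.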
Why are panel data useful?
Variables:
• Traffic fatality rate (number of traffic deaths per year in a state, per
10,000 state residents)
• Tax on a case of beer
• Other factors (legal driving age, drink driving laws, etc.)
Example: data for 1982
• Linear regression model, 1982 data
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |   .1484603   .1883682     0.79   0.435    -.2307051    .5276258
       _cons |   2.010381   .1390785    14.46   0.000     1.730431    2.290332
------------------------------------------------------------------------------
[Figure: Traffic fatalities in 1982. Vertical axis: fatalities per 10,000 residents. Horizontal axis: beertax.]
Example: data for 1988
• Linear regression model, 1988 data
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |   .4387546   .1644538     2.67   0.011     .1077262     .769783
       _cons |   1.859073   .1059887    17.54   0.000     1.645729    2.072417
------------------------------------------------------------------------------
[Figure: Traffic fatalities in 1988. Vertical axis: fatalities per 10,000 residents. Horizontal axis: beertax.]
Example: data for 1982-88
• Linear regression model, 1982-88 data
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |   .3646054   .0621698     5.86   0.000     .2423117    .4868992
       _cons |   1.853308   .0435671    42.54   0.000     1.767607    1.939008
------------------------------------------------------------------------------
[Figure: Traffic fatalities in 1982-88. Vertical axis: fatalities per 10,000 residents. Horizontal axis: beertax.]
• Math:
• Consider fatality rates in 1988 and 1982:
$FatalityRate_{i,88} = \beta_0 + \beta_1 BeerTax_{i,88} + \beta_2 Z_i + u_{i,88}$
$FatalityRate_{i,82} = \beta_0 + \beta_1 BeerTax_{i,82} + \beta_2 Z_i + u_{i,82}$
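Subtracting the 1982 equation from the 1988 equation shows why these two years are considered together. This is a sketch of the step, assuming that $Z_i$ is a state characteristic that takes the same value in both years:

```latex
\begin{align*}
FatalityRate_{i,88} - FatalityRate_{i,82}
  &= \beta_1 (BeerTax_{i,88} - BeerTax_{i,82}) + \beta_2 (Z_i - Z_i) + (u_{i,88} - u_{i,82}) \\
  &= \beta_1 (BeerTax_{i,88} - BeerTax_{i,82}) + (u_{i,88} - u_{i,82})
\end{align*}
```

Both $\beta_0$ and $\beta_2 Z_i$ cancel, so $\beta_1$ is identified from changes within each state even though $Z_i$ is never observed.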
• Regression of the change in fatality rates on the change in beer tax, 1982 to 1988, without constant
------------------------------------------------------------------------------
     dvfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dbtax |  -.8689216   .3929877    -2.21   0.032    -1.659511   -.0783323
------------------------------------------------------------------------------
Example: data for 1982 and 1988
• Fixed effects model, 1982 and 1988 data, with constant
------------------------------------------------------------------------------
     dvfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dbtax |  -1.040973   .4172279    -2.49   0.016    -1.880809   -.2011364
       _cons |  -.0720371    .060644    -1.19   0.241    -.1941072     .050033
------------------------------------------------------------------------------
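The sign change between the single-year regressions and the change regressions can be reproduced on a small artificial example. The Python sketch below is not Stata code and does not use the traffic data: the two-period panel is made up, with outcomes generated as y = alpha_i + beta * x, beta = -0.5, and entity effects alpha_i that are larger where x is larger. The helper `ols_slope` is hypothetical.

```python
# Toy two-period panel, made up for illustration (not the traffic data).
# Outcomes are generated as y = alpha_i + BETA * x, with larger entity
# effects alpha_i for entities that also have larger x.
BETA = -0.5
alpha = {"A": 1.0, "B": 2.0, "C": 3.0}
x_first = {"A": 0.5, "B": 1.0, "C": 1.5}
x_second = {"A": 0.7, "B": 1.1, "C": 1.9}
y_first = {i: alpha[i] + BETA * x_first[i] for i in alpha}
y_second = {i: alpha[i] + BETA * x_second[i] for i in alpha}


def ols_slope(x, y, constant=True):
    """Slope from a one-regressor OLS fit, with or without an intercept."""
    if constant:
        x_bar = sum(x) / len(x)
        y_bar = sum(y) / len(y)
        x = [xi - x_bar for xi in x]
        y = [yi - y_bar for yi in y]
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


ids = sorted(alpha)
# Cross-section regression in the first period: alpha_i is omitted and biases the slope.
cross_section = ols_slope([x_first[i] for i in ids], [y_first[i] for i in ids])
# First differences: alpha_i cancels, so the slope recovers BETA.
dx = [x_second[i] - x_first[i] for i in ids]
dy = [y_second[i] - y_first[i] for i in ids]
first_difference = ols_slope(dx, dy, constant=False)

print(round(cross_section, 3), round(first_difference, 3))
```

In this constructed example the cross-section slope is positive while the first-difference slope is approximately -0.5, the value used to generate the data. The toy data contain no error term, so the recovery is exact up to rounding; with real data the estimate is only as good as the assumption that the omitted factor is constant over time.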
[Figure: Change in traffic fatalities in 1982-88. Vertical axis: change in fatalities per 10,000 residents.]
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -2.563516   2.535592    -1.01   0.326    -7.913146    2.786115
          CA |   2.151935    .252372     8.53   0.000     1.619477    2.684394
          TX |   3.387019   1.100874     3.08   0.007     1.064377    5.709661
          MA |   1.857948    .654464     2.84   0.011     .4771501    3.238747
------------------------------------------------------------------------------
Example: regression line for each state
[Figure: Predicted fatality rates across states. Vertical axis: fatality rate. Horizontal axis: beertax. Series: CA, TX, MA.]
Fixed effects regression
Look at the regression lines for all three states:
$FR_{CA,t} = \alpha_{CA} + \beta_{BeerTax} BeerTax_{CA,t} + u_{CA,t}$
$FR_{TX,t} = \alpha_{TX} + \beta_{BeerTax} BeerTax_{TX,t} + u_{TX,t}$
$FR_{MA,t} = \alpha_{MA} + \beta_{BeerTax} BeerTax_{MA,t} + u_{MA,t}$
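The three lines share one slope and differ only in their intercepts, so they can be collected into a single equation. This is a sketch, where the binary state indicators $D^{TX}_i$ and $D^{MA}_i$ and the coefficients $\gamma$ are notation introduced here for illustration:

```latex
\begin{align*}
FR_{i,t} &= \alpha_i + \beta_{BeerTax} BeerTax_{i,t} + u_{i,t}, \qquad i \in \{CA, TX, MA\} \\
         &= \alpha_{CA} + \gamma_{TX} D^{TX}_i + \gamma_{MA} D^{MA}_i
            + \beta_{BeerTax} BeerTax_{i,t} + u_{i,t}
\end{align*}
```

Here $\gamma_{TX} = \alpha_{TX} - \alpha_{CA}$ and $\gamma_{MA} = \alpha_{MA} - \alpha_{CA}$, so CA serves as the reference state. Including all three indicators and dropping the common constant instead gives the intercepts $\alpha_{CA}$, $\alpha_{TX}$, and $\alpha_{MA}$ directly.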
                                                F(1,17)            =      1.02
corr(u_i, Xb)  = -0.7743                        Prob > F           =    0.3262
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -2.563516   2.535592    -1.01   0.326    -7.913146    2.786115
       _cons |   2.465634   .6659065     3.70   0.002     1.060694    3.870574
-------------+----------------------------------------------------------------
     sigma_u |  .81136868
     sigma_e |  .16784624
         rho |  .95896182   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(2, 17) = 65.50                      Prob > F = 0.0000
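The entity intercepts do not have to be estimated explicitly: subtracting each entity's own mean from the outcome and the regressor removes them, which is the "within" transformation behind fixed effects estimation. The Python sketch below is not Stata code; the nine observations are made up and generated as y = alpha_i + beta * x with beta = -0.5 and alpha = 1, 2, 3 for entities A, B, C. The helpers `within_slope` and `pooled_slope` are hypothetical.

```python
from collections import defaultdict

# Toy panel, made up for illustration: (entity, x, y) with
# y = alpha_i + beta * x, beta = -0.5, alpha = 1, 2, 3 for A, B, C.
observations = [
    ("A", 0.5, 0.75), ("A", 0.7, 0.65), ("A", 0.6, 0.70),
    ("B", 1.0, 1.50), ("B", 1.1, 1.45), ("B", 1.5, 1.25),
    ("C", 1.5, 2.25), ("C", 1.9, 2.05), ("C", 1.6, 2.20),
]


def within_slope(rows):
    """Slope after subtracting each entity's own mean from x and y."""
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))
    sxy = sxx = 0.0
    for values in groups.values():
        x_bar = sum(x for x, _ in values) / len(values)
        y_bar = sum(y for _, y in values) / len(values)
        for x, y in values:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


def pooled_slope(rows):
    """Slope from pooled OLS with one common intercept (all rows in one group)."""
    return within_slope([("all", x, y) for _, x, y in rows])


print(round(pooled_slope(observations), 3), round(within_slope(observations), 3))
```

In this constructed example the pooled slope is positive, because entities with high x also have high intercepts, while the within slope is approximately -0.5, the value used to generate the data.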
Regression with time fixed effects
• Time fixed effects
• An omitted variable may vary over time but not across states
• For example, safer cars; changes in national laws, etc.
• These factors produce intercepts that change over time
• Time fixed effects are written in binary regressor form
• Let y83, y84,…, y88 denote binary indicators for each year
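In binary regressor form, the model with time fixed effects can be sketched as follows. The coefficients $\delta_{83},\dots,\delta_{88}$ are notation introduced here, and 1982 is assumed to be the omitted base year, which is consistent with the indicators starting at y83:

```latex
FR_{i,t} = \beta_0 + \beta_{BeerTax} BeerTax_{i,t}
         + \delta_{83}\, y83_t + \delta_{84}\, y84_t + \dots + \delta_{88}\, y88_t + u_{i,t}
```

The intercept is $\beta_0$ in 1982 and $\beta_0 + \delta_s$ in year $s$, so each $\delta_s$ measures the shift in the intercept relative to 1982 that is common to all states.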
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |   .3663358      .0626     5.85   0.000     .2431877    .4894839
         y83 |  -.0820359   .1116734    -0.73   0.463    -.3017224    .1376506
         y84 |  -.0717331   .1116734    -0.64   0.521    -.2914195    .1479533
         y85 |  -.1105458   .1116765    -0.99   0.323    -.3302383    .1091467
         y86 |  -.0161185   .1116815    -0.14   0.885    -.2358209     .203584
         y87 |  -.0155355    .111695    -0.14   0.889    -.2352645    .2041935
         y88 |  -.0010271    .111718    -0.01   0.993    -.2208014    .2187471
       _cons |   1.894848   .0856585    22.12   0.000     1.726338    2.063357
------------------------------------------------------------------------------
Example: State and time fixed effects model
Fixed-effects (within) regression               Number of obs      =       336
Group variable: state                           Number of groups   =        48
                                                F(7,281)           =      3.50
corr(u_i, Xb)  = -0.6781                        Prob > F           =    0.0013
------------------------------------------------------------------------------
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.6399799   .1973768    -3.24   0.001    -1.028505   -.2514551
         y83 |  -.0799029   .0383537    -2.08   0.038       -.1554   -.0044058
         y84 |  -.0724206   .0383517    -1.89   0.060    -.1479136    .0030725
         y85 |  -.1239763   .0384418    -3.23   0.001    -.1996468   -.0483058
         y86 |  -.0378645   .0385879    -0.98   0.327    -.1138225    .0380936
         y87 |  -.0509021   .0389737    -1.31   0.193    -.1276196    .0258155
         y88 |  -.0518038   .0396235    -1.31   0.192    -.1298003    .0261927
       _cons |    2.42847   .1081198    22.46   0.000     2.215643    2.641298
-------------+----------------------------------------------------------------
     sigma_u |  .70945965
     sigma_e |  .18788295
         rho |  .93446372   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(47, 281) = 53.19                    Prob > F = 0.0000
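In a balanced panel, entity and time effects can be removed together by subtracting entity means and period means and adding back the grand mean. The Python sketch below illustrates this two-way transformation on made-up data generated as y = alpha_i + lambda_t + beta * x with beta = -0.5. It is not Stata code and is not a description of how Stata computes the output above; the helper `two_way_demean` is hypothetical and assumes a balanced panel.

```python
# Toy balanced panel, made up for illustration: x[entity] lists x over three periods.
x = {"A": [0.5, 0.7, 0.6], "B": [1.0, 1.1, 1.5], "C": [1.5, 1.9, 1.6]}
alpha = {"A": 1.0, "B": 2.0, "C": 3.0}   # entity effects
lam = [0.0, -0.1, -0.2]                  # time effects common to all entities
BETA = -0.5
y = {i: [alpha[i] + lam[t] + BETA * x[i][t] for t in range(3)] for i in x}


def two_way_demean(data):
    """Subtract entity means and period means, then add back the grand mean.

    Assumes a balanced panel: every entity has a value in every period.
    """
    ids = sorted(data)
    periods = range(len(data[ids[0]]))
    entity_mean = {i: sum(data[i]) / len(data[i]) for i in ids}
    period_mean = [sum(data[i][t] for i in ids) / len(ids) for t in periods]
    grand_mean = sum(entity_mean.values()) / len(ids)
    return [data[i][t] - entity_mean[i] - period_mean[t] + grand_mean
            for i in ids for t in periods]


x_dd = two_way_demean(x)
y_dd = two_way_demean(y)
# OLS through the origin on the transformed data.
slope = sum(a * b for a, b in zip(x_dd, y_dd)) / sum(a * a for a in x_dd)
print(round(slope, 3))
```

The slope on the transformed data is approximately -0.5, the value used to generate the toy data, because both alpha_i and lambda_t are wiped out by the transformation.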
Exercises
• Estimate a fixed effects model (-xtreg-) of fatality rates
on beertax using data from 1982 and 1983
• What is the marginal effect of beertax on fatality rates?
• Is the marginal effect of beertax on fatality rates
significantly different from 0?
• Add a time indicator for 1983 to the model. Do fatality rates
tend to increase or decrease over time?
• Is the time indicator significantly different from 0?
Summary
• Learning outcomes
• Understand what we mean by panel data
• Understand the distinction between entity and time fixed
effects
• Understand the underlying assumptions of fixed effects
models
• Estimate fixed effects models in Stata
Extra exercises
• Use the Risk_Panel_Balanced.dta dataset (Andersen et al. [2008])
• Generate dummy variables for each of the four risk aversion tasks and call
them task_1, task_2, task_3, and task_4
• Run an OLS regression model of crra on dummy variables for all four risk
aversion tasks
• What are the predicted crra-values for the four risk aversion tasks?
• Are the estimated coefficients for the four risk aversion tasks similar?
• Repeat the same two exercises with observations for the first and second
experiment, respectively (condition the regression on repeat=0, and repeat=1)
• Generate a new variable crra_diff that measures the difference in crra values
between the two experiments (i.e., crra when repeat=1 minus crra when
repeat=0)
• Run an OLS regression model of crra_diff on dummy variables for all four risk
aversion tasks
• What are the predicted crra_diff values for the four risk aversion tasks?
• Are the estimated coefficients for the four risk aversion tasks similar?
Extra exercises, ctd.
• Continue using Risk_Panel_Balanced.dta
• Define the panel data structure (xtset id)
• Run a fixed effects (FE) model of crra on repeat for each risk
aversion task
• Are the estimated coefficients for repeat significantly different from 0?
• Compare the estimated coefficients for repeat to the results from an
OLS model with the same dependent and independent variables. Are the
estimated coefficients different?
• Add dummy variables for task 2, task 3, and task 4 to the FE and OLS
models. Are the estimated coefficients for repeat and the task identifiers
different across the two types of models?
• Identify and keep subjects who provided 8 non-missing responses.
• Run FE and OLS models of crra on repeat and dummy variables for
task 2, task 3, and task 4. Are the estimated coefficients for repeat and
the task identifiers different across the two types of models?