Testing Parallel Trend Assumption

Roberto Vidri

Join Date: Mar 2019

Posts: 36
#1


09 Nov 2020, 17:11

I'm working on a difference-in-differences (DID) model. Unfortunately, I have very little experience with these models. I've visually inspected the data as a check on the parallel trends assumption, and I believe it holds. However, as this analysis will be peer-reviewed, I'm sure the reviewers will require an empirical test.

The visual inspection was done by plotting the means and fitted values for the entire time period 2010-2017, before the intervention (<2014), and after the intervention (>=2014).

I have individual level data for >30,000 subjects. I'm running unadjusted and adjusted DID models utilizing the "diff" command.

Code:

by YEAR_OF_DIAGNOSIS expand, sort: egen pc_uninsured_expand = mean(100*uninsured)
*Fitted
twoway (lfit pc_uninsured_expand YEAR_OF_DIAGNOSIS if expand==1) ///
    (scatter pc_uninsured_expand YEAR_OF_DIAGNOSIS if expand==1) ///
    (lfit pc_uninsured_expand YEAR_OF_DIAGNOSIS if expand==0) ///
    (scatter pc_uninsured_expand YEAR_OF_DIAGNOSIS if expand==0), ///
    ylabel(0(5)20) ytitle(Uninsured (%)) ///
    legend(label(1 "Expansion - Fitted") label(2 "Expansion") label(3 "Nonexpansion fitted") label(4 "Nonexpansion")) xlabel(#8)

1. Plotting of the data over entire time period.

2. Plot of data before implementation of "treatment"

3. Plot of data after implementation of "treatment"

The difference-in-differences model that I will be running is the following, where the outcome is "uninsured" (1=uninsured, 0=insured), "expand" is my treatment variable (treated=1, not treated=0), and "exp_year" is the grouping variable for before/after 2014 (before=0, after=1).

Code:

diff uninsured, t(expand) p(exp_year)
diff uninsured, t(expand) p(exp_year) cov(AGE SEX race_cat hispanic_cat NO_HSD_QUAR_16 MED_INC_QUAR_16)
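As a cross-check on the unadjusted model, the same DID estimate can be sketched as an ordinary regression with a group-by-period interaction. This is an illustrative sketch, not the poster's actual analysis: the data are simulated stand-ins (made-up values, 8,000 rows) that reuse the variable names from the post, and it assumes the user-written `diff` command fits a linear model in which the interaction coefficient is the DID estimate. Standard errors may differ from `diff` output depending on the options used.

```stata
* Simulated stand-in data; names follow the post, values are invented
clear
set seed 2020
set obs 8000
generate YEAR_OF_DIAGNOSIS = 2010 + mod(_n, 8)      // years 2010-2017
generate expand    = runiform() < 0.5               // 1 = treated group
generate exp_year  = YEAR_OF_DIAGNOSIS >= 2014      // 1 = post period
generate uninsured = runiform() < (0.15 - 0.03*expand - 0.05*expand*exp_year)

* Linear probability model: group, period, and their interaction
regress uninsured i.expand##i.exp_year, vce(robust)

* The interaction coefficient is the DID estimate
lincom 1.expand#1.exp_year
```

Covariates such as those in the adjusted `diff` call would simply be added to the right-hand side of `regress`.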

I'm looking for a simple way of "proving" that the parallel trend assumption holds.

Thanks!
Tom Scott

Join Date: Apr 2019

Posts: 266
#2

09 Nov 2020, 18:51

Roberto Vidri, I would never use the term "prove" in the social sciences, but I think you could provide some evidence that the pre-intervention trends do not differ across groups by pooling the data and fitting a mixed-effects model, with a binary treatment indicator predicting variation in the outcome's pre-intervention slope. If the coefficient is not significant, you have some evidence that the pre-intervention slopes do not differ between groups.

Code:

mixed uninsured expand##i.year if year < 2014 || id: expand, vce(robust) reml

My code might be slightly off. It's been a while since I've run one.
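For data without repeated observations per subject, a simpler variant of the same idea is a pre-period regression that tests whether the time trend differs by group. The sketch below is an assumption-laden illustration, not the method proposed in this post: it swaps the mixed model for plain `regress`, uses simulated stand-in data (made-up values) with the variable names from post #1, and treats the outcome with a linear probability model.

```stata
* Simulated stand-in data; names follow post #1, values are invented
clear
set seed 2020
set obs 8000
generate YEAR_OF_DIAGNOSIS = 2010 + mod(_n, 8)      // years 2010-2017
generate expand    = runiform() < 0.5               // 1 = treated group
generate exp_year  = YEAR_OF_DIAGNOSIS >= 2014      // 1 = post period
generate uninsured = runiform() < (0.15 - 0.03*expand - 0.05*expand*exp_year)

* Pre-period only: does the linear time trend differ by group?
regress uninsured i.expand##c.YEAR_OF_DIAGNOSIS if exp_year == 0, vce(robust)
test 1.expand#c.YEAR_OF_DIAGNOSIS

* More flexible: one group-by-year term per pre-period year, tested jointly
regress uninsured i.expand##i.YEAR_OF_DIAGNOSIS if exp_year == 0, vce(robust)
testparm i.expand#i.YEAR_OF_DIAGNOSIS
```

A non-significant test is consistent with parallel pre-trends but does not establish them; with few pre-period years the test may simply lack power.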

Last edited by Tom Scott; 09 Nov 2020, 19:08.
Roberto Vidri

Join Date: Mar 2019

Posts: 36
#3

11 Nov 2020, 12:15

Tom, Thanks for your thoughts!
Marry Lee

Join Date: Nov 2020

Posts: 186
#4

30 Apr 2021, 12:39

Dear Tom Scott, can you please tell me why you suggested using "mixed" instead of probit or logit in this case?

Code:

mixed uninsured expand##i.year if year < 2014 || id: expand, vce(robust) reml

Thank you.
Tom Scott

Join Date: Apr 2019

Posts: 266
#5

30 Apr 2021, 13:22

Marry Lee, I believe it was because it is a multilevel model, with repeated observations nested within individuals.
Marry Lee

Join Date: Nov 2020

Posts: 186
#6

30 Apr 2021, 13:38

Tom Scott, thank you for your quick answer. I don't quite understand why it is a multilevel model. Shouldn't a model be multilevel only when there is more than one level of observation (for example individual, school, city)?
Tom Scott

Join Date: Apr 2019

Posts: 266
#7

30 Apr 2021, 15:26

Marry Lee multilevel is when any unit is nested within another. So the classic example is students within classrooms within schools. But if you collect the same data on an individual at monthly intervals for 2 years, then you have 24 monthly observations nested within that individual. If you do that for 1000 individuals across 50 cities, then you have 24,000 (24x1000) observations nested within 1000 individuals nested within 50 cities. Then you can look at things like how much variation in an outcome is within individual compared to between individual, or whether the relationship between individual characteristics (e.g., race) and your individual level outcome (e.g., desire to run for political office) depends on city characteristics (e.g., population size).
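The nesting described above can be sketched with simulated data: 24 monthly observations per person, 1,000 persons, 50 cities. Everything here is hypothetical, including the variable names (`city`, `person`, `month`, `y`), the assumption of 20 persons per city, and the variance components, which are chosen only for illustration.

```stata
* Build a three-level dataset: months within persons within cities
clear
set seed 12345
set obs 50
generate city = _n
generate u_city = rnormal(0, 1)                     // city-level random intercept
expand 20                                           // 20 persons per city (assumed)
bysort city: generate person = (city - 1)*20 + _n   // unique person id, 1-1000
generate u_person = rnormal(0, 2)                   // person-level random intercept
expand 24                                           // 24 monthly observations each
bysort person: generate month = _n
generate y = 10 + 0.1*month + u_city + u_person + rnormal(0, 3)

* Random intercepts for cities and for persons nested within cities
mixed y month || city: || person:

* Share of residual variation attributable to each level
estat icc
```

The intraclass correlations from `estat icc` summarize how much of the variation sits between cities and between persons rather than within a person over time.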