
  • Difference in difference with multiple treatments

    Dear Statalist Users,

    I am trying to measure the impact of Uber on the earnings of taxi drivers and would greatly appreciate advice on how best to tackle this in Stata 16.1.

    I have the following data:
    1) Pooled cross-sectional individual-level data on taxi drivers in 5 major US cities over 10 years: earnings (dependent variable lincearn) and various individual and city-level characteristics (age, gender, citizenship, unemployment)
    2) Data on when Uber was introduced in each specific city: a dummy variable uber, which takes the value 1 if Uber was present in the individual's city in that year and 0 otherwise

    I am looking to measure the impact of Uber on their earnings while controlling for individual and city-level characteristics. I have looked into using xtset/xtreg, which says there are too many time values within panel.

    I would greatly appreciate suggestions on how I could approach this in Stata.

    Below is a snapshot of my data, where met2013 identifies the city the individual is in and lincearn is the log of their earnings.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year long met2013 float(uber lincearn)
    2009 12420 0  10.12663
    2009 12420 0  9.680344
    2009 12420 0 10.308952
    2009 12420 0 10.308952
    2009 12420 0  9.740969
    2009 12420 0  9.423838
    2009 12420 0  9.775654
    2009 12420 0 10.714417
    2009 12420 0 10.596635
    2009 12420 0 10.819778
    2009 12420 0  9.998797
    2009 12420 0 9.2103405
    2009 12420 0  9.798127
    2009 12420 0  9.425451
    2009 12420 0         .
    2009 12420 0  9.723164
    2009 12420 0  10.12663
    2009 12420 0 10.819778
    2009 12420 0 10.341743
    2009 12420 0   11.0021
    2009 12420 0  10.08581
    2009 12420 0 10.645425
    2009 12420 0         .
    2009 12420 0  9.287301
    2009 12420 0 10.463103
    2009 12420 0  9.159047
    2009 12420 0 10.308952
    2009 12420 0 10.488493
    2009 12420 0 12.538967
    2009 12420 0         .
    2009 12420 0  8.987197
    2009 12420 0   6.39693
    2009 12420 0  10.37349
    2009 12420 0         .
    2009 12420 0  9.903487
    2009 12420 0 10.691945
    2009 12420 0  5.703783
    2009 12420 0         .
    2009 12420 0 9.2103405
    2009 12420 0         .
    2009 12420 0  8.853665
    2009 12420 0  10.18112
    2009 12420 0  9.746834
    2009 12420 0 9.1049795
    2009 12420 0 10.060492
    2009 12420 0  10.37349
    2009 12420 0  8.294049
    2009 12420 0   9.92818
    2009 12420 0  9.798127
    2009 12420 0  7.901007
    2009 12420 0         .
    2009 12420 0  9.392662
    2009 12420 0 11.141862
    2009 12420 0  9.169518
    2009 12420 0 10.714417
    2009 12420 0   11.0021
    2009 12420 0 10.491274
    2009 12420 0 10.691945
    2009 12420 0  8.961879
    2009 12420 0  10.12663
    2009 12420 0   6.55108
    2009 12420 0 10.819778
    2009 12420 0         .
    2009 12420 0 10.645425
    2009 12420 0         .
    2009 12420 0 10.308952
    2009 12420 0 10.714417
    2009 12420 0  8.853665
    2009 12420 0         .
    2009 12420 0 10.596635
    2009 12420 0 10.308952
    2009 12420 0  8.764053
    2009 12420 0 10.518673
    2009 12420 0  9.539644
    2009 12420 0 10.203592
    2009 12420 0  8.294049
    2009 12420 0 10.437053
    2009 12420 0 10.645425
    2009 12420 0   6.55108
    2009 12420 0 10.778956
    2009 12420 0  9.903487
    2009 12420 0 10.485703
    2009 12420 0  9.392662
    2009 12420 0 10.819778
    2009 12420 0         .
    2009 12420 0 10.714417
    2009 12420 0   9.11603
    2009 12420 0         .
    2009 12420 0         .
    2009 12420 0  8.699514
    2009 12420 0 10.404263
    2009 12420 0  10.23996
    2009 12420 0   11.0021
    2009 12420 0 10.645425
    2009 12420 0  8.881836
    2009 12420 0  8.517193
    2009 12420 0  9.615806
    2009 12420 0  9.305651
    2009 12420 0  10.16969
    2009 12420 0  9.305651
    end
    label values year year_lbl
    label def year_lbl 2009 "2009", modify
    label values met2013 met2013_lbl
    label def met2013_lbl 12420 "Austin-Round Rock, TX", modify
    Kind regards,
    Aayush Bakshi

  • #2
    Let me make sure I'm understanding you: you have 5 treated cities in your analysis, yes? How many cities are there in total?

    EDIT: You write that you
    have looked into using xtset/xtreg, which says there are too many time values within panel.
    I don't believe for a moment that Stata literally said this. What did Stata really tell you?
    Last edited by Jared Greathouse; 29 Mar 2022, 19:52.



    • #3
      Dear Jared,

      Thanks for your reply.

      I have a total of 5 cities in my analysis: some were treated earlier and some later. For example, some cities had Uber in 2011 while others had it in 2015.

      Apologies, let me clarify the xtset/xtreg issue. The code I used was:
      Code:
      xtset met2013 year
      where met2013 identifies the city each individual is in.

      I received the error "repeated time values within panel" from this command.

      I am not too sure how to approach this. I could use aggregate values for each city, but I would lose the ability to control for individual-level characteristics, so this is not ideal. Or would a staggered DiD be a viable strategy?

      Thanks,
      Aayush



      • #4
        Okay, so I have a few thoughts about this. Firstly, you have a very small number of treated units. In fact, all of your units are treated, so you cannot construct the counterfactual, since you have no units to compare your treated units to in the post-policy period. In stats terms, you never observe the untreated potential outcomes: you observe all units under treatment and have nothing to compare them to.

        Assuming you had 20 untreated cities along with your 5 treated ones, imbalances in treatment timing can be addressed by commands written by Rios, Chaisemartin and others, and Xu and others.
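
        For reference, a minimal sketch of installing the community-contributed estimators alluded to here (both are SSC packages; csdid is Rios-Avila's implementation of Callaway and Sant'Anna, and did_multiplegt is de Chaisemartin and D'Haultfoeuille's):
        Code:
        * Staggered-adoption DiD estimators mentioned above
        ssc install csdid            // Callaway & Sant'Anna, via Rios-Avila
        ssc install did_multiplegt   // de Chaisemartin & D'Haultfoeuille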

        Now for the outcome: maybe I'm wrong, or you can give the econometric theory better than me, but I'm not sure why we'd need individual-level data here. After all, the policy isn't being applied directly to the individual; it's being applied to the city the individual lives or works in. I suppose we COULD use individual-level data, but if this were my problem (which it sort of is, actually; I'm doing an Uber analysis right now), I would simply aggregate this to the city level, which brings me full circle to the moral of the story:

        You need more data. Five cities that are all eventually treated wouldn't convince me of parallel trends or similarity on common factors (certainly not at the individual level), and the design also won't work because you need a group of units that was never exposed to the intervention.


        Oh, the reason you're getting repeated time values is that your data are at the individual level. If you and I are both in Atlanta and we xtset on city, the Atlanta ID appears more than once in the same year because we both live there, so the city-year pair does not uniquely identify observations. That's another reason to do your work at the city level, in my opinion, unless you just want to create an ID for the individual, as sketched below.
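
        For illustration, a minimal sketch of both workarounds, using the variable names from the dataex excerpt above:
        Code:
        * Workaround 1: give each observation its own ID (pooled cross
        * sections, so individuals are not followed over time)
        gen long id = _n

        * Workaround 2: collapse to a city-year panel, which xtset accepts
        collapse (mean) lincearn (max) uber, by(met2013 year)
        xtset met2013 year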



        • #5
          Thanks for your great advice; I have taken a lot from your comments and built on them:

          1) Most importantly, I have included untreated cities in my analysis which never had Uber within the time frame.

          But I have two options from this analysis (taken from https://www.statalist.org/forums/for...fference):

          First alternative (at the individual level)

          Y_ist = α + β T_st + a_s + θ_t + ε_ist
          i – individual, s – state, t – year
          T_st – whether state s had the treatment by year t


          Second alternative (at the state level)

          Y_st = α + β T_st + a_s + θ_t + ε_st

          s – state, t – year

          Y_st is the average of the dependent variable for all individuals in state s at time t.
          T_st – whether state s had the treatment by year t
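
          For concreteness, a minimal sketch of the two alternatives in Stata, with uber playing the role of T and met2013 the role of s:
          Code:
          * First alternative: individual-level data with city and year fixed effects
          reg lincearn i.uber i.met2013 i.year, vce(cluster met2013)

          * Second alternative: collapse to city-year means first, then estimate
          collapse (mean) lincearn (max) uber, by(met2013 year)
          xtset met2013 year
          xtreg lincearn i.uber i.year, fe vce(cluster met2013)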

          Ideally, given the nature of the paper and since I want to look at the impact of individual characteristics, I would like to run it at the individual level rather than at the state level. Furthermore, I have concerns about standard errors, test statistics, and confidence intervals with state-level data.

          How would I run this analysis in Stata? What would you recommend?



          • #6
            So, how many untreated units are there? The helpfiles for the commands I've cited will guide you on how to implement them. I'd likely use synthetic controls or some newer version of the difference-in-differences commands I mentioned. What concerns you about your SEs, t-stats, and CIs with state data?



            • #7
              This looks like a staggered intervention with pooled cross sections. You probably want to allow some heterogeneity in the treatment effects by treatment cohort and calendar time; presumably the initial effect of Uber was smaller than the effects later. This is easy to do by defining cohort dummies for the different initial treatment periods. Then, these get interacted with time dummies in the post-treatment periods.
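
              A minimal sketch of that interaction structure, assuming a variable firstyr (the first year Uber operated in the city, 0 for never-treated cities) that is not in the posted excerpt:
              Code:
              * firstyr defines the treatment cohorts (hypothetical variable).
              * uber = 1 exactly in post-treatment city-years, so interacting the
              * cohort and year dummies with the uber == 1 indicator yields
              * cohort-by-calendar-time effects in the post-treatment periods only
              reg lincearn i.firstyr i.year i.firstyr#i.year#1.uber, vce(cluster met2013)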

              A difficult question is computing standard errors. If you condition on the treatment assignment, then you can use heteroskedasticity-robust standard errors. But if you want to account for the uncertainty in the "policy" assignment, you should cluster. The problem is, clustering with few treated cities might not work well.
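
              In code, the two choices look like this (a sketch with city and year fixed effects; met2013 is the city identifier from the excerpt):
              Code:
              * Conditioning on the treatment assignment: robust standard errors
              reg lincearn uber i.met2013 i.year, vce(robust)

              * Accounting for uncertainty in the policy assignment: cluster by city
              * (as noted above, this may work poorly with few treated cities)
              reg lincearn uber i.met2013 i.year, vce(cluster met2013)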

              You can collapse the data to city-level panel data but, again, the clustering issue can be problematic. It's worth a try, though.

              Here's a link to a paper of mine, along with Stata files, that discusses the panel data case. The issues with pooled cross sections are similar.


              https://www.dropbox.com/sh/zj91darud...bgsnxS6Za?dl=0



              • #8
                Then, these get interacted with time dummies in the post-treatment periods.
                Jeff Wooldridge, is this essentially what's going on under the hood with the newer DD estimators, like this one?



                • #9
                  You need more data. Five cities that are all eventually treated wouldn't convince me of parallel trends or similarity on common factors (certainly not at the individual level), and the design also won't work because you need a group of units that was never exposed to the intervention.
                  Actually, this design, with a treatment being introduced sequentially into different cohorts, is increasingly used in epidemiology, and particularly in clinical-translational research. It is called the stepped-wedge design. And even though all participants are ultimately treated, the design can be thought of as defining a sequence of eras. The first era is before the first group gets treated. The second era begins when the first group gets treated. The third era begins when the second group gets treated, etc. The final era is when all groups have been treated.

                  Within each era other than the first and last, you have a synchronic comparison between treated and untreated groups, and you also have, within each group, the within-group comparison of pre- and post-treatment outcomes. These can be combined to provide an estimate of the treatment effect. The analysis has to include time indicators and group indicators. This gives at least partial adjustment for secular trends and between-group baseline differences.

                  And usually when we do this in clinical-translational research, we randomize the order in which the groups begin treatment. Evidently in this context randomization is not possible. The question posed here also has a wrinkle in that we usually assume that treatment effects are constant over time, whereas here it is specifically assumed to be otherwise. But, as Jeff Wooldridge has pointed out, this can be handled with some extra interaction terms.
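
                  In this thread's variables, the basic analysis described here would be sketched as:
                  Code:
                  * Time and group indicators plus the treatment dummy combine the
                  * between-group and within-group comparisons described above
                  reg lincearn i.year i.met2013 i.uber, vce(cluster met2013)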



                  • #10
                    If I understand you well, you're essentially saying that (let's say with 3 total units) we have a pre-period for all units (no treated units), era 1 where unit 1 is treated (compared to the other two units), era 2 where unit 2 is compared to the other, still untreated unit, and era 3 where all our units are now treated. Clyde Schechter is that about right?

                    I've actually advocated (well, I didn't invent it, I adapted it) a similar approach for synthetic controls, in concert with recent advances in difference-in-differences, as is the case here from what I can tell. I'll clarify what I meant by my first comment above: I agree that it's mechanically possible to estimate this. EDIT: My main motivation for suggesting more data is that having more panels makes it more likely that we have units which are suitable comparison groups. If we have an intervention in Atlanta, sure, I can compare Atlanta to Miami, Charlotte, and Boston, but wouldn't it (usually) be better to have a more representative sample of other units? Of course this isn't always possible, but I figured that with Uber this might be useful.
                    Last edited by Jared Greathouse; 30 Mar 2022, 21:42.



                    • #11
                      Yes, that's about right. Actually, in era two, units 1 and 2, which are treated, are compared to unit 3, which is still untreated. But otherwise, yes, that's how it works. In each era except the first and last, there is a comparison of all treated units with all remaining untreated units. And there is also the within-unit comparison of pre- and post-treatment.

                      And I agree that more units make for a better design. In the typical clinical-translational application, though, the units have many participants each, but units, being typically medium-to-large groups, are difficult to recruit. So you usually end up with a modest number of units.



                      • #12
                        Jeff Wooldridge Thanks for your reply,

                        I have read through your paper as well as watched your helpful seminar. This is precisely what I am aiming to do with my study. I have a few follow-up questions with regard to the Stata code:

                        1) Since I will be using pooled cross-sectional data, this will not be two-way fixed effects, is that correct? I will be using something along the lines of:

                        Code:
                        * Adapted from the paper's example; as I understand it, d2-d4 are
                        * treatment-cohort dummies, f02-f04 post-treatment time dummies,
                        * and x is a covariate
                        reg lincearn uber i.year x d2 d3 d4 c.d2#c.x c.d3#c.x c.d4#c.x ///
                            c.f02#c.x c.f03#c.x c.f04#c.x, vce(cluster id)
                        This is just example code adapted from your paper, but I will be using reg instead of xtreg and also adding interaction terms between cohort dummies and post-treatment time dummies. The variable uber will take the value one if the treatment was applied in that city during that year, and its coefficient will be the coefficient of interest. Am I correct in saying this?

                        2) In terms of computing standard errors, would you recommend clustering if I were to add more treated/untreated cities?
                        3) Would this method be possible if all cities were eventually treated within the time period? Or would I require cities that were never treated?



                        • #13
                          I don't understand the one-way interaction terms. Why did you use them? I'm not asking rhetorically, by the way, since I've only ever seen two-way interactions used in practice.

                          You have panel data. Pooled cross section, whatever you'd like to call it, you observe the same units over a time period; thus, this is the standard DD setup:
                          Code:
                          * Standard two-way fixed-effects DD; fe is needed because
                          * xtreg defaults to random effects
                          xtreg lincearn uber i.year, fe vce(cluster id)
                          I wouldn't use this setup myself due to staggered interventions as well as heterogeneous treatment effects... but this is the start. Regarding the SEs, the econometrics Gods have spoken on this subject, and the Gospel according to Abadie and co sayeth the following:
                          With fixed effects, one should cluster if either (i) both PC_n < 1 (clustering in the sampling) and there is heterogeneity in the treatment effects, or (ii) σ² > 0 (clustering in the assignment) and there is heterogeneity in the treatment effects
                          An econometrician may correct me, but it is certain you have heterogeneous effects, and it is likely the case that you have clustering in the assignment of the intervention.

                          Clyde and I spoke about all units eventually being treated. Given that I work in public policy, I've never encountered a situation where all units are eventually treated, but it's possible to do this in the DD/event-study framework. Aayush Bakshi



                          • #14
                            Jared Greathouse, apologies, I meant to say that I would use something along the lines of:
                            Code:
                            * uber#i.year: treatment-by-year interactions;
                            * uber#i.met2013#i.year: treatment-by-city-by-year interactions
                            * (met2013 is the city identifier from the data excerpt)
                            reg lincearn uber i.year uber#i.year uber#i.met2013#i.year, vce(cluster id)
                            to show the interaction term.

                            I am not sure if I am mistaken here, but since my data survey different individuals over the time period (and I want to use individual-level data to control for individual characteristics), I would not be able to use xtreg in my regression.

                            Essentially I want to run a regression with the following form (with the additional interaction terms of course):
                            lincearn_ist = α + β Uber_st + γ X_i + a_s + θ_t + ε_ist
                            i – individual, s – city, t – year
                            Uber_st – whether city s had Uber by year t
                            X_i – vector of individual characteristics


                            and this would be my setup for a staggered DiD. Would this be correct?
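
                            A hedged translation of that specification into Stata (age and female are placeholders for the X_i vector; they are not in the posted excerpt):
                            Code:
                            * Individual-level DiD with individual controls and city and
                            * year fixed effects (sketch; covariates are illustrative)
                            reg lincearn i.uber age i.female i.met2013 i.year, vce(cluster met2013)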

                            I appreciate the help and have learnt a lot about clustering from your message.



                            • #15
                              I still don't get the one-way interaction terms. Do you even know how to interpret them? Again, I certainly don't mean to sound mean or rhetorical; I genuinely don't know how to interpret those. You can still use individual covariates within the context of a DD regression if you want (unless they're time-invariant with a fixed-effects approach).

                              Trust me, any model with more than one interaction term is a nightmare to interpret, and you have one 1-way interaction and one 3-way interaction term. Other than that, I think the main thing to be concerned with is imbalances in event time and heterogeneous treatment effects. You likely need one of Stata's more advanced DD commands to handle these things, unless this is just a class paper, in which case your instructor likely won't know or care.

