DiD with multiple treatment periods

Luca Toni

Join Date: Jul 2022

Posts: 85
#1

08 Aug 2024, 06:44

Hi,

I'm trying to study the causal effect of UEFA Financial Fair Play (FFP) rules on the financial sustainability of football clubs. I have a dataset of all football clubs that were sanctioned at some point by UEFA (treated). However, the clubs were not all treated (sanctioned) at the same time, but in different years.
Some football clubs were sanctioned in 2013, some in 2014, and so on. My dependent variable is a measure of financial sustainability that I collect for each club in each year. I want to use other football clubs in the same league that were not sanctioned as my control group.
I basically want to compare team A from league X, which was sanctioned by UEFA in year Z, to team B from league X, which was never sanctioned, pre and post treatment.
Can DiD be useful here? How can I approach it given that not all units in the treatment group were treated in the same year? I also want to control for the league and for the market value of each football club.

Thanks.
Luca Toni

#2

08 Aug 2024, 10:03

Can someone help me please?
George Ford

Join Date: Aug 2014

Posts: 3040
#3

08 Aug 2024, 14:25

lots of options. csdid, jwdid, did_multiplegt_dyn, eventdd.
Luca Toni
#4

08 Aug 2024, 15:38

Originally posted by George Ford

lots of options. csdid, jwdid, did_multiplegt_dyn, eventdd.

Hey George,
So I ran csdid, but my pre-trends test was statistically significant, which suggests that the parallel trends assumption is violated. That doesn't surprise me, since my outcome variable is volatile. Maybe I should add more teams to the control group? I doubt it will help, though. I have about 600 observations in the treatment group and around the same number in the control group.
Also, I'm not sure that I understood the estimation correctly. I ran this command:

Code:

csdid OverallBalanceDeficit MarketValue, ivar(team) time(Year) gvar(first_treat) method(ipw)

and another command without the market value covariate. I tried to control for the market value of the specific club, but my question is: what does this command actually do behind the scenes? Does it take a specific treated football club A from league X and compare it to a respective untreated football club B from the same league X, before and after club A's treatment? In other words, does it take the league into account? I ask because I suspect that I didn't estimate it correctly. It's crucial for me to control for the league and for the market value of the club.

Besides that, I got a significant ATT coefficient, but my concern is the pre-trends test.
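
For reference, the pre-trends can also be inspected period by period rather than only through the joint test. What follows is a rough sketch, assuming the user-written csdid and drdid packages are installed and reusing the variable names from the command above; the window of -5 to 5 is an arbitrary choice for illustration. As far as I understand csdid, covariates such as MarketValue enter through the propensity score or outcome model, so this sketch does not by itself restrict comparisons to clubs in the same league.

Code:

```stata
* Sketch only: assumes csdid and drdid are installed (ssc install csdid, ssc install drdid)
* and that team is a numeric panel identifier, as in the command above.
csdid OverallBalanceDeficit MarketValue, ivar(team) time(Year) gvar(first_treat) method(ipw)

* Joint test that all pre-treatment ATT(g,t) estimates are zero
estat pretrend

* Event-study aggregation: ATTs by time relative to the sanction year
* (window(-5 5) is an illustrative choice, not a recommendation)
estat event, window(-5 5)

* Plot the event-study estimates with their confidence intervals
csdid_plot

* Single overall ATT averaged across cohorts and periods
estat simple
```

If only one or two distant leads drive the significant pre-trend test, the event-study table should make that visible.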

Last edited by Luca Toni; 08 Aug 2024, 15:43.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#5

08 Aug 2024, 17:03

How many treated units and controls do you have? Or in other words, how many units ever get treated and how many do not?

Edit: and why limit the controls to teams that are in the same league? How do we know there aren't other potentially comparable teams in other leagues (so long as they're on common support for treatment, i.e., there's no team that's impossible to be treated)?

Last edited by Jared Greathouse; 08 Aug 2024, 17:05.
Luca Toni
#6

08 Aug 2024, 23:10

Originally posted by Jared Greathouse

How many treated units and controls do you have? Or in other words, how many units ever get treated and how many do not?

Edit: and why limit the controls to teams that are in the same league? How do we know there aren't other potentially comparable teams in other leagues (so long as they're on common support for treatment, i.e., there's no team that's impossible to be treated)?

Hey Jared,

I have 626 observations in the treatment group, which represent 42 football clubs that were treated (sanctioned) at some point between 2013 and 2019. My sample period is 2008-2023.
In the control group I have (so far, I can add more) 552 observations, which represent 35 football clubs that were never treated between 2008 and 2023.

Keep in mind that not all football teams (in both the treatment and control groups) have observations for the entire sample period (some teams were relegated to the second league at some point).

I want to limit the controls to the same league because I want to control for unobservable characteristics within the specific league. Maybe there are some local regulations or rules that are relevant to a specific league but not relevant to other leagues.

I wrote to George about my concerns and the things I don't entirely understand about the estimation itself. I would appreciate it if you could answer those as well.

Thanks a lot.

Last edited by Luca Toni; 08 Aug 2024, 23:14.
Jared Greathouse
#7

09 Aug 2024, 06:53

So, regarding estimation, it doesn't really surprise me that your trends aren't parallel. You have 6 years of pretreatment data (at least). Of course it depends on when the other units were treated, but getting parallel trends to a volatile outcome series in a short time period will be pretty hard to do, even if we invoke a conditional parallel trends assumption.

One thing you could do is calculate the growth rate from year to year, but this necessarily shaves off one preintervention period, a luxury you do not have.
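
A minimal sketch of the year-to-year transformation, assuming one observation per club-year and the variable names used elsewhere in this thread. The names team_id, obd_change and obd_growth are hypothetical, and dividing by the absolute value of last year's level is my own choice so that the sign stays interpretable if the outcome can be negative.

Code:

```stata
* Sketch: year-over-year change and growth rate of the outcome.
* Assumes one observation per team-year; skip the egen line if team_id already exists.
egen team_id = group(team)
xtset team_id Year

* First difference (defined even when the outcome is negative)
gen obd_change = D.OverallBalanceDeficit

* Growth rate relative to the absolute value of last year's level;
* left missing when last year's value is zero or unobserved
gen obd_growth = D.OverallBalanceDeficit / abs(L.OverallBalanceDeficit) if L.OverallBalanceDeficit != 0
```

Because D. and L. respect the panel structure declared by xtset, the first observed year of each club and any year following a gap come out as missing, which is the lost period mentioned above.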

One thing you could do is use my forward DID command. It uses a forward selection algorithm to select the optimal control group for each unit and uses it to construct its counterfactual, and then averages the ATTs (it also calculates standard errors and CIs).

fdid relies only on outcome data, so you will not be able to use covariates. But, I must warn you: it's no guarantee. If no subset of controls satisfies DiD's PTA in the usual sense, then you'll need to consider more flexible methods such as synthetic controls.

If you're interested, I can send you the paper I've written on it (the current draft for Stata Journal).
Luca Toni
#8

10 Aug 2024, 02:37

Originally posted by Jared Greathouse

So, regarding estimation, it doesn't really surprise me that your trends aren't parallel. You have 6 years of pretreatment data (at least). Of course it depends on when the other units were treated, but getting parallel trends to a volatile outcome series in a short time period will be pretty hard to do, even if we invoke a conditional parallel trends assumption.

One thing you could do is calculate the growth rate from year to year, but this necessarily shaves off one preintervention period, a luxury you do not have.

One thing you could do is use my forward DID command. It uses a forward selection algorithm to select the optimal control group for each unit and uses it to construct its counterfactual, and then averages the ATTs (it also calculates standard errors and CIs).

fdid relies only on outcome data, so you will not be able to use covariates. But, I must warn you: it's no guarantee. If no subset of controls satisfies DiD's PTA in the usual sense, then you'll need to consider more flexible methods such as synthetic controls.

If you're interested, I can send you the paper I've written on it (the current draft for Stata Journal).

Hey Jared Greathouse,

I tried to use the fdid command, but it gave me the error "The data are not xtset", so I declared the panel with "xtset team Year". Then I ran the command again and got the error "The data must be strongly balanced". My dataset is not balanced, so what can I do?
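
One possible way to look at the imbalance, sketched under the assumption that the sample runs over the 16 years 2008-2023 mentioned earlier in the thread and that there is one observation per club-year. The names team_id and n_years are hypothetical. Dropping clubs with gaps changes the sample, so this is only a diagnostic starting point, not a recommendation.

Code:

```stata
* Sketch: inspect the panel pattern and build a strongly balanced subsample.
* Assumes the 2008-2023 sample period (16 years); skip the egen line if team_id already exists.
egen team_id = group(team)
xtset team_id Year
xtdescribe                      // shows the participation pattern of each club

preserve
* Number of years with a non-missing outcome for each club
bysort team_id: egen n_years = count(OverallBalanceDeficit)
keep if n_years == 16           // keep only clubs observed in every year
xtset team_id Year              // should now report "strongly balanced"
* ... run the estimator that needs a balanced panel here ...
restore
```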

Also, one thing I want to understand: in the csdid command, my treatment variable (the "gvar") was equal to 1 for the first year that the unit was treated and 0 in all other years. Here in fdid, the treatment variable is equal to 1 for the treatment year and all years afterwards (post-treatment), and 0 for all the pre-treatment years. Am I right?
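
On the encoding question: as far as the csdid help file describes it, gvar() should hold the year in which a unit is first treated (constant within unit) and 0 for never-treated units, rather than a 0/1 dummy. A small sketch of the two encodings, assuming a variable TreatmentYear that holds the sanction year and is missing for never-sanctioned clubs; gvar_year and post are hypothetical names.

Code:

```stata
* Sketch: two different encodings of treatment timing.
* Assumes TreatmentYear holds the sanction year and is missing for never-sanctioned clubs.

* Cohort variable of the kind csdid's gvar() expects:
* first treatment year for treated clubs, 0 for never-treated clubs
gen gvar_year = cond(missing(TreatmentYear), 0, TreatmentYear)

* 0/1 post-treatment indicator: 1 from the sanction year onward, 0 otherwise
gen byte post = !missing(TreatmentYear) & Year >= TreatmentYear
```

A quick check such as "tabulate gvar_year" should then list 0 plus the sanction years, not just 0 and 1.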

Last edited by Luca Toni; 10 Aug 2024, 02:43.
George Ford
#9

10 Aug 2024, 09:06

what happens with eventdd?
Luca Toni
#10

10 Aug 2024, 12:11

Originally posted by George Ford

what happens with eventdd?

Sorry, I don't understand.

As far as I know, an event study refers to a situation where you don't have a control group and you compare a single difference: post versus pre treatment.

This is unlike DiD, which uses treatment and control groups. I don't understand why I should use an event study if I have a control group and want to find the causal effect of the treatment.

An event study reminds me of RDD, where you compare the treatment group above and below the cutoff. I did such a thing, but I want to figure out how to use DiD estimation.

My only problem is the PTA. As Jared said, my dependent variable is volatile and I have only a few years before treatment. In such an environment, it's hard if not impossible for the PTA to hold.

Last edited by Luca Toni; 10 Aug 2024, 12:14.
George Ford
#11

10 Aug 2024, 15:27

eventdd is DID. give it a shot.

another issue may be the endogeneity of the treatment. typically, studies that are cited (important) will be replicated.
George Ford
#12

10 Aug 2024, 15:52

This looks promising.

Code:

egen pid = group(paper_id)
g t2t = year - rep_year
eventdd citations , timevar(t2t) method(hdfe, absorb(pid year) cluster(pid))

Last edited by George Ford; 10 Aug 2024, 16:12.
George Ford

#13

10 Aug 2024, 16:51

This may be of some use.

Code:

egen pid = group(paper_id)
g t2t = year - rep_year

save try_data, replace
eventdd citations , timevar(t2t) method(hdfe, absorb(pid year) cluster(pid)) keepdummies leads(4) lags(10) inrange 

reghdfe citations lead* lag* , absorb(pid year) cluster(pid) 
testparm lead4 lead3 lead2
coefplot , drop(lead10 lead9 lead8 lead7 lead6 lead5 _cons) vertical xline(4, lp(dot)) yline(0)

ppmlhdfe citations lead* lag* , absorb(pid year) cluster(pid) 
testparm lead4 lead3 lead2
coefplot , drop(lead10 lead9 lead8 lead7 lead6 lead5 _cons) vertical xline(4, lp(dot)) yline(0)

Luca Toni

#14

11 Aug 2024, 03:53

Originally posted by George Ford

eventdd is DID. give it a shot.

another issue may be the endogeneity of the treatment. typically, studies that are cited (important) will be replicated.

Thank you, George Ford.

Following your advice, I ran it, and these are my results:

Code:

 
. egen team_id = group(team)

. g t2t = Year - TreatmentYear

. eventdd OverallBalanceDeficit, timevar(t2t) method(hdfe, absorb(team_id Year) cluster (team_id))
(MWFE estimator converged in 5 iterations)
note: lag10 omitted because of collinearity

HDFE Linear regression                            Number of obs   =      1,178
Absorbing 2 HDFE groups                           F(  20,     76) =       1.55
Statistics robust to heteroskedasticity           Prob > F        =     0.0902
                                                  R-squared       =     0.4258
                                                  Adj R-squared   =     0.3661
                                                  Within R-sq.    =     0.0315
Number of clusters (team_id) =         77         Root MSE        =    29.8767

                               (Std. err. adjusted for 77 clusters in team_id)
------------------------------------------------------------------------------
             |               Robust
OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lead11 |   36.33975   23.22445     1.56   0.122    -9.915752    82.59524
      lead10 |    31.0256   21.96698     1.41   0.162    -12.72544    74.77663
       lead9 |   24.49165    20.5195     1.19   0.236    -16.37648    65.35978
       lead8 |   25.99789   17.40088     1.49   0.139    -8.658978    60.65475
       lead7 |   22.28342   14.64098     1.52   0.132    -6.876607    51.44346
       lead6 |   13.49273   13.52903     1.00   0.322    -13.45267    40.43813
       lead5 |   12.52803   10.44202     1.20   0.234     -8.26906    33.32512
       lead4 |   .5172765   7.859251     0.07   0.948    -15.13578    16.17033
       lead3 |   3.996275   6.450263     0.62   0.537    -8.850537    16.84309
       lead2 |  -3.179978   5.150796    -0.62   0.539    -13.43868     7.07872
        lag0 |  -7.138832   4.751749    -1.50   0.137    -16.60276    2.325096
        lag1 |   -12.4355   5.540443    -2.24   0.028    -23.47025   -1.400751
        lag2 |  -20.10649   6.758972    -2.97   0.004    -33.56815   -6.644832
        lag3 |  -21.05852   7.459255    -2.82   0.006    -35.91492   -6.202131
        lag4 |  -20.66857   10.08522    -2.05   0.044    -40.75503   -.5821062
        lag5 |  -22.22665   9.404761    -2.36   0.021    -40.95785   -3.495442
        lag6 |  -29.67976   11.60824    -2.56   0.013    -52.79957   -6.559948
        lag7 |  -25.60667   13.13578    -1.95   0.055    -51.76884    .5555005
        lag8 |  -37.71128   13.24565    -2.85   0.006    -64.09229   -11.33028
        lag9 |  -29.13825   12.50392    -2.33   0.022    -54.04197   -4.234535
       lag10 |          0  (omitted)
       _cons |   12.83826   4.112128     3.12   0.003     4.648251    21.02828
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     team_id |        77          77           0    *|
        Year |        16           1          15     |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

. save try_data, replace
file try_data.dta saved

. eventdd OverallBalanceDeficit, timevar(t2t) method(hdfe, absorb(team_id Year) cluster (team_id)) keepdummies leads(4) lags(10) inrange 
(MWFE estimator converged in 5 iterations)
note: lag10 omitted because of collinearity

HDFE Linear regression                            Number of obs   =      1,178
Absorbing 2 HDFE groups                           F(  20,     76) =       1.55
Statistics robust to heteroskedasticity           Prob > F        =     0.0902
                                                  R-squared       =     0.4258
                                                  Adj R-squared   =     0.3661
                                                  Within R-sq.    =     0.0315
Number of clusters (team_id) =         77         Root MSE        =    29.8767

                               (Std. err. adjusted for 77 clusters in team_id)
------------------------------------------------------------------------------
             |               Robust
OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lead11 |   36.33975   23.22445     1.56   0.122    -9.915752    82.59524
      lead10 |    31.0256   21.96698     1.41   0.162    -12.72544    74.77663
       lead9 |   24.49165    20.5195     1.19   0.236    -16.37648    65.35978
       lead8 |   25.99789   17.40088     1.49   0.139    -8.658978    60.65475
       lead7 |   22.28342   14.64098     1.52   0.132    -6.876607    51.44346
       lead6 |   13.49273   13.52903     1.00   0.322    -13.45267    40.43813
       lead5 |   12.52803   10.44202     1.20   0.234     -8.26906    33.32512
       lead4 |   .5172765   7.859251     0.07   0.948    -15.13578    16.17033
       lead3 |   3.996275   6.450263     0.62   0.537    -8.850537    16.84309
       lead2 |  -3.179978   5.150796    -0.62   0.539    -13.43868     7.07872
        lag0 |  -7.138832   4.751749    -1.50   0.137    -16.60276    2.325096
        lag1 |   -12.4355   5.540443    -2.24   0.028    -23.47025   -1.400751
        lag2 |  -20.10649   6.758972    -2.97   0.004    -33.56815   -6.644832
        lag3 |  -21.05852   7.459255    -2.82   0.006    -35.91492   -6.202131
        lag4 |  -20.66857   10.08522    -2.05   0.044    -40.75503   -.5821062
        lag5 |  -22.22665   9.404761    -2.36   0.021    -40.95785   -3.495442
        lag6 |  -29.67976   11.60824    -2.56   0.013    -52.79957   -6.559948
        lag7 |  -25.60667   13.13578    -1.95   0.055    -51.76884    .5555005
        lag8 |  -37.71128   13.24565    -2.85   0.006    -64.09229   -11.33028
        lag9 |  -29.13825   12.50392    -2.33   0.022    -54.04197   -4.234535
       lag10 |          0  (omitted)
       _cons |   12.83826   4.112128     3.12   0.003     4.648251    21.02828
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     team_id |        77          77           0    *|
        Year |        16           1          15     |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

. reghdfe OverallBalanceDeficit lead* lag* , absorb(team_id Year) cluster(team_id) 
(MWFE estimator converged in 5 iterations)
note: lag10 omitted because of collinearity

HDFE Linear regression                            Number of obs   =      1,178
Absorbing 2 HDFE groups                           F(  20,     76) =       1.55
Statistics robust to heteroskedasticity           Prob > F        =     0.0902
                                                  R-squared       =     0.4258
                                                  Adj R-squared   =     0.3661
                                                  Within R-sq.    =     0.0315
Number of clusters (team_id) =         77         Root MSE        =    29.8767

                               (Std. err. adjusted for 77 clusters in team_id)
------------------------------------------------------------------------------
             |               Robust
OverallBal~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lead11 |   36.33975   23.22445     1.56   0.122    -9.915752    82.59524
      lead10 |    31.0256   21.96698     1.41   0.162    -12.72544    74.77663
       lead9 |   24.49165    20.5195     1.19   0.236    -16.37648    65.35978
       lead8 |   25.99789   17.40088     1.49   0.139    -8.658978    60.65475
       lead7 |   22.28342   14.64098     1.52   0.132    -6.876607    51.44346
       lead6 |   13.49273   13.52903     1.00   0.322    -13.45267    40.43813
       lead5 |   12.52803   10.44202     1.20   0.234     -8.26906    33.32512
       lead4 |   .5172765   7.859251     0.07   0.948    -15.13578    16.17033
       lead3 |   3.996275   6.450263     0.62   0.537    -8.850537    16.84309
       lead2 |  -3.179978   5.150796    -0.62   0.539    -13.43868     7.07872
        lag0 |  -7.138832   4.751749    -1.50   0.137    -16.60276    2.325096
        lag1 |   -12.4355   5.540443    -2.24   0.028    -23.47025   -1.400751
        lag2 |  -20.10649   6.758972    -2.97   0.004    -33.56815   -6.644832
        lag3 |  -21.05852   7.459255    -2.82   0.006    -35.91492   -6.202131
        lag4 |  -20.66857   10.08522    -2.05   0.044    -40.75503   -.5821062
        lag5 |  -22.22665   9.404761    -2.36   0.021    -40.95785   -3.495442
        lag6 |  -29.67976   11.60824    -2.56   0.013    -52.79957   -6.559948
        lag7 |  -25.60667   13.13578    -1.95   0.055    -51.76884    .5555005
        lag8 |  -37.71128   13.24565    -2.85   0.006    -64.09229   -11.33028
        lag9 |  -29.13825   12.50392    -2.33   0.022    -54.04197   -4.234535
       lag10 |          0  (omitted)
       _cons |   12.83826   4.112128     3.12   0.003     4.648251    21.02828
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     team_id |        77          77           0    *|
        Year |        16           1          15     |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

. testparm lead4 lead3 lead2

 ( 1)  lead4 = 0
 ( 2)  lead3 = 0
 ( 3)  lead2 = 0

       F(  3,    76) =    1.71
            Prob > F =    0.1718

. coefplot , drop(lead11 lead10 lead9 lead8 lead7 lead6 lead5 _cons) vertical xline(4, lp(dot)) yline(0)

[Image: figure1.PNG]

[Image: figure2.PNG]

[Image: figure3.PNG]

My dependent variable also takes negative values, so I can't use the ppmlhdfe command.

Overall, what can I learn from these results? I'm not sure what the purpose of dropping leads 5-11 is. I would appreciate it if you could clarify.

Thanks a lot.

George Ford
#15

11 Aug 2024, 09:01

Hah. I confused your post with another one as far as the data, but this is still useful to you. (Ignore the ppml stuff).

Limiting pre-treatment periods is common, and often falls in the range of 3-5 periods. The last 2 figures are "money" -- zero pre-treatment effects and then negative and significant treatment effects. You could get by with those results.

But, if you look at the first figure, you see the entire effect series has a downward trend. Still, all the pretreatment CIs include 0. You can test them all using reghdfe (I could never get "estat leads" to work after eventdd; I do not know why).
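
A sketch of that joint test, assuming the lead and lag dummies are still in memory from an eventdd run with the keepdummies option and that the variable names match the log posted above.

Code:

```stata
* Sketch: joint test of every pre-treatment lead.
* Assumes the lead*/lag* dummies were kept in memory by eventdd's keepdummies option.
reghdfe OverallBalanceDeficit lead* lag*, absorb(team_id Year) cluster(team_id)

* H0: all lead coefficients in the model are jointly zero
testparm lead*
```

This differs from the test shown earlier in the thread only in that it covers all estimated leads rather than just lead4, lead3 and lead2.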

I'd look at the pre-treatment data carefully to see if something funky is going on when t2t is in the range -8 to -4. Do certain clubs fall out of or re-enter the sample during that time? That sort of thing.
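
One simple way to start that check, sketched with the t2t and team_id variables created in the log above. It only counts observations, so it shows who is in the sample at each event time, not why.

Code:

```stata
* Sketch: how many club-year observations sit at each event time?
* Never-sanctioned clubs have missing t2t and are excluded by the if condition.
tabulate t2t if !missing(t2t)

* Which clubs contribute observations 8 to 4 years before their sanction?
tabulate team_id if inrange(t2t, -8, -4)
```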