DiD approach with treatment group as independent variable? Or comparing coefficients across models?

Johannes de Ruig

Join Date: Jun 2024
Posts: 6


18 Jun 2024, 05:56

Hi all,

I have data on household donations to different kinds of charities (see below).

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input int(id year) byte(hmcan hmchn hmhln hmian)
 1 2012 0 0 1 0
 1 2014 0 0 0 0
 2 2002 0 1 1 0
 3 2002 1 0 1 1
 3 2004 1 0 1 1
 4 2006 0 0 1 0
 4 2008 0 0 1 1
 4 2010 0 1 1 0
 5 2014 0 0 0 0
 5 2016 0 0 0 0
 6 2012 0 1 1 1
 7 2002 1 0 1 0
 8 2002 1 1 1 1
 8 2004 0 1 1 0
 8 2006 0 1 1 1
 8 2008 0 1 1 1
 8 2010 0 1 1 1
 8 2012 0 1 1 0
 8 2014 0 1 1 0
 8 2016 0 1 1 0
 9 2002 1 0 1 1
 9 2004 0 0 1 1
 9 2006 0 0 1 1
10 2004 0 0 1 1
11 2008 0 0 1 1
11 2010 0 0 0 1
11 2012 1 1 1 1
11 2014 0 0 1 1
11 2016 0 0 1 0
11 2019 0 0 1 0
12 2019 0 0 1 0
13 2008 0 1 1 1
14 2008 0 0 1 0
14 2010 0 0 0 0
14 2012 0 0 0 0
15 2002 0 1 1 0
16 2008 0 0 1 1
16 2010 1 0 1 1
16 2012 1 0 1 1
17 2002 0 1 1 1
18 2014 0 0 1 1
19 2002 0 1 1 1
20 2006 0 0 1 1
21 2016 0 0 1 1
21 2019 0 0 1 1
22 2006 0 1 1 1
22 2008 0 1 1 1
23 2019 0 0 1 1
23 2021 0 0 1 1
24 2002 0 1 1 0
25 2012 1 0 1 1
26 2010 0 1 1 1
26 2012 0 0 1 1
27 2012 0 0 1 1
27 2014 0 0 1 0
27 2016 0 0 1 1
27 2019 0 0 1 1
28 2008 0 1 1 1
28 2010 0 1 0 0
28 2012 0 1 1 1
29 2010 0 0 1 1
29 2012 0 0 1 1
29 2014 0 0 1 1
29 2016 0 0 1 1
29 2019 0 0 1 1
29 2021 0 0 1 1
30 2002 0 0 1 0
30 2004 0 0 1 0
30 2008 0 0 1 0
30 2010 0 0 1 0
30 2012 0 0 1 0
31 2004 0 1 1 1
32 2002 0 1 1 1
33 2008 0 0 1 0
34 2019 0 0 1 1
35 2006 0 1 1 0
36 2012 0 1 0 1
37 2006 0 0 1 0
38 2006 1 1 1 1
39 2008 0 0 1 0
39 2010 0 0 1 0
39 2012 0 0 1 0
40 2012 0 1 1 0
41 2002 0 1 1 1
41 2004 1 1 1 0
41 2006 0 0 1 0
41 2008 0 0 0 0
41 2010 0 0 1 0
41 2012 0 0 0 0
41 2014 0 0 1 0
41 2016 0 0 1 0
41 2019 0 0 1 0
42 2002 0 1 1 1
42 2004 0 1 1 0
43 2002 0 1 1 0
43 2004 0 1 1 1
43 2006 0 1 1 1
43 2008 0 0 1 1
44 2006 0 0 1 1
44 2008 0 0 1 0
end
label values id labels0
label values hmcan labels144
label def labels144 0 "no", modify
label def labels144 1 "yes", modify
label values hmchn labels135
label def labels135 0 "no", modify
label def labels135 1 "yes", modify
label values hmhln labels136
label def labels136 0 "no", modify
label def labels136 1 "yes", modify
label values hmian labels137
label def labels137 0 "no", modify
label def labels137 1 "yes", modify

In the data example, I included the household ID, the year, and 4 of the different kinds of charities.
For each kind, it is recorded whether or not the household made a donation.
For one of these kinds, cultural charities (hmchn), the tax benefits changed in 2010.
I want to find out whether this tax reform had a significant effect on cultural donations by comparing cultural donations after 2010 with donations to the other charities after 2010 (on which the tax reform should have had no impact).
It seems to me that a DiD approach would make sense, but the problem is that the treatment applied to all households, so there is no control group of households.
However, the other kinds of charities could serve as a control group.

A different method I tried was to fit separate models and compare the coefficients:

Code:

gen after2010 = year>=2010
reg hmcan after2010
est store hmcan
reg hmchn after2010
est store hmchn
suest hmcan hmchn
lincom [hmcan_mean]:after2010 - [hmchn_mean]:after2010

I would then repeat this for each of the other charities, or compare against all other donations combined:

Code:

gen otherdonations = max(hmchn, hmhln, hmian)
reg otherdonations after2010
est store otherdonations
suest hmcan otherdonations
lincom [hmcan_mean]:after2010 - [otherdonations_mean]:after2010

Although I do get a coefficient and a p-value this way, I suspect there are reasons why it is not valid to do it this simply.

Is there a way that I can use a DiD approach on this problem? Or do I need a different approach altogether?

I am using Stata 17 on Windows

Last edited by Johannes de Ruig; 18 Jun 2024, 06:02.


Clyde Schechter

Join Date: Apr 2014
Posts: 30100

18 Jun 2024, 09:35

You could -reshape- the data to long layout, making separate observations for each charity_type instead of separate variables. Then you could do a DID analysis like this:

Code:

reshape long hm, i(id year) j(ct) string
label define charity_type 1 "chn"    // set chn as the lowest value, so it becomes the base level
encode ct, gen(charity_type) label(charity_type)
label list charity_type

gen byte pre_post = year >= 2010
rename hm donated

table (charity_type) (pre_post), statistic(fvpercent 1.donated) ///
    sformat("%s%%" fvpercent)  nototals

xtset id
xtreg donated i.charity_type##i.pre_post i.year, fe
testparm i.charity_type#1.pre_post
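
A note on reading that output, followed by a possible variant. Because chn is the base level, each charity_type#1.pre_post coefficient is the post-2010 change for that charity minus the post-2010 change for chn, so the estimated effect on chn relative to that charity has the opposite sign. If a single estimate against all other charities pooled is wanted, a two-group version might look like the sketch below. The variable name treated and the choice of vce(cluster id) are assumptions made for illustration and are not part of the code above. Since donated is binary, this is a linear probability model.

Code:

* Run after the -reshape-, -rename- and -xtset- commands above.
* treated is a hypothetical name: 1 for the charity type affected by the reform.
gen byte treated = (ct == "chn")

* Cluster on household because each household contributes many observations
* (assumption: household is the appropriate clustering level).
* pre_post is a function of year, so Stata will omit one collinear term;
* the interaction coefficient is not affected by that.
xtreg donated i.treated##i.pre_post i.year, fe vce(cluster id)

* DID estimate: post-2010 change for chn minus the change for the other charities.
lincom 1.treated#1.pre_post

Pooling the other charities assumes they would all have followed the same trend as cultural donations without the reform. The separate interaction terms in the model above give a rough check on whether they moved together.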
