
  • DiD approach with treatment group as independent variable? Or comparing coefficients across models?

    Hi all,

    I have data of household donations to different kind of charities (see below).

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(id year) byte(hmcan hmchn hmhln hmian)
     1 2012 0 0 1 0
     1 2014 0 0 0 0
     2 2002 0 1 1 0
     3 2002 1 0 1 1
     3 2004 1 0 1 1
     4 2006 0 0 1 0
     4 2008 0 0 1 1
     4 2010 0 1 1 0
     5 2014 0 0 0 0
     5 2016 0 0 0 0
     6 2012 0 1 1 1
     7 2002 1 0 1 0
     8 2002 1 1 1 1
     8 2004 0 1 1 0
     8 2006 0 1 1 1
     8 2008 0 1 1 1
     8 2010 0 1 1 1
     8 2012 0 1 1 0
     8 2014 0 1 1 0
     8 2016 0 1 1 0
     9 2002 1 0 1 1
     9 2004 0 0 1 1
     9 2006 0 0 1 1
    10 2004 0 0 1 1
    11 2008 0 0 1 1
    11 2010 0 0 0 1
    11 2012 1 1 1 1
    11 2014 0 0 1 1
    11 2016 0 0 1 0
    11 2019 0 0 1 0
    12 2019 0 0 1 0
    13 2008 0 1 1 1
    14 2008 0 0 1 0
    14 2010 0 0 0 0
    14 2012 0 0 0 0
    15 2002 0 1 1 0
    16 2008 0 0 1 1
    16 2010 1 0 1 1
    16 2012 1 0 1 1
    17 2002 0 1 1 1
    18 2014 0 0 1 1
    19 2002 0 1 1 1
    20 2006 0 0 1 1
    21 2016 0 0 1 1
    21 2019 0 0 1 1
    22 2006 0 1 1 1
    22 2008 0 1 1 1
    23 2019 0 0 1 1
    23 2021 0 0 1 1
    24 2002 0 1 1 0
    25 2012 1 0 1 1
    26 2010 0 1 1 1
    26 2012 0 0 1 1
    27 2012 0 0 1 1
    27 2014 0 0 1 0
    27 2016 0 0 1 1
    27 2019 0 0 1 1
    28 2008 0 1 1 1
    28 2010 0 1 0 0
    28 2012 0 1 1 1
    29 2010 0 0 1 1
    29 2012 0 0 1 1
    29 2014 0 0 1 1
    29 2016 0 0 1 1
    29 2019 0 0 1 1
    29 2021 0 0 1 1
    30 2002 0 0 1 0
    30 2004 0 0 1 0
    30 2008 0 0 1 0
    30 2010 0 0 1 0
    30 2012 0 0 1 0
    31 2004 0 1 1 1
    32 2002 0 1 1 1
    33 2008 0 0 1 0
    34 2019 0 0 1 1
    35 2006 0 1 1 0
    36 2012 0 1 0 1
    37 2006 0 0 1 0
    38 2006 1 1 1 1
    39 2008 0 0 1 0
    39 2010 0 0 1 0
    39 2012 0 0 1 0
    40 2012 0 1 1 0
    41 2002 0 1 1 1
    41 2004 1 1 1 0
    41 2006 0 0 1 0
    41 2008 0 0 0 0
    41 2010 0 0 1 0
    41 2012 0 0 0 0
    41 2014 0 0 1 0
    41 2016 0 0 1 0
    41 2019 0 0 1 0
    42 2002 0 1 1 1
    42 2004 0 1 1 0
    43 2002 0 1 1 0
    43 2004 0 1 1 1
    43 2006 0 1 1 1
    43 2008 0 0 1 1
    44 2006 0 0 1 1
    44 2008 0 0 1 0
    end
    label values id labels0
    label values hmcan labels144
    label def labels144 0 "no", modify
    label def labels144 1 "yes", modify
    label values hmchn labels135
    label def labels135 0 "no", modify
    label def labels135 1 "yes", modify
    label values hmhln labels136
    label def labels136 0 "no", modify
    label def labels136 1 "yes", modify
    label values hmian labels137
    label def labels137 0 "no", modify
    label def labels137 1 "yes", modify
    In the data example, I included the household ID, the year, and 4 of the different kinds of charities.
    For each kind, the variable records whether or not the household made a donation.
    For one of these kinds of charities, cultural charities (hmchn), the tax benefits changed in 2010.
    I want to find out whether this tax reform had a significant effect on cultural donations by comparing cultural donations after 2010 with donations to the other charities after 2010 (on which the tax reform should have no impact).
    It seems to me that a DiD approach would make sense, but the problem is that the treatment applied to all households, so there is no control group of households.
    However, the other kinds of charities could serve as a control group.
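
    To make the comparison concrete, the DiD quantity I have in mind could be written roughly as below. This is my own shorthand and only a sketch: \bar{y} denotes the share of households that donated, "chn" is the cultural category, "other" stands for the non-cultural charities, and "pre"/"post" split the years at 2010.

    ```latex
    \[
    \hat{\delta}_{\mathrm{DiD}}
      = \left( \bar{y}_{\mathrm{chn},\,\mathrm{post}} - \bar{y}_{\mathrm{chn},\,\mathrm{pre}} \right)
      - \left( \bar{y}_{\mathrm{other},\,\mathrm{post}} - \bar{y}_{\mathrm{other},\,\mathrm{pre}} \right)
    \]
    ```

    If the reform had no effect and the other charities followed the same trend, this difference should be close to zero.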

    A different method I tried was fitting separate models and comparing the coefficients:

    Code:
    gen after2010 = year>=2010
    reg hmcan after2010
    est store hmcan
    reg hmchn after2010
    est store hmchn
    suest hmcan hmchn
    lincom [hmcan_mean]:after2010 - [hmchn_mean]:after2010
    I would then repeat this for each of the other charities, or compare against all other donations combined:

    Code:
    * hmchn is the cultural (treated) category, so pool the remaining charities as the comparison
    gen otherdonations = max(hmcan, hmhln, hmian)
    reg otherdonations after2010
    est store otherdonations
    suest hmchn otherdonations
    lincom [hmchn_mean]:after2010 - [otherdonations_mean]:after2010
    Although I do get a coefficient and a p-value this way, I suspect there are reasons why it is not valid to do it this simply.

    Is there a way that I can use a DiD approach on this problem? Or do I need a different approach altogether?

    I am using Stata 17 on Windows.
    Last edited by Johannes de Ruig; 18 Jun 2024, 07:02.

  • #2
    You could -reshape- the data to long layout, making separate observations for each charity_type instead of separate variables. Then you could do a DID analysis like this:

    Code:
    reshape long hm, i(id year) j(ct) string
    label define charity_type 1 "chn" // SET CHN AS THE LOWEST VALUE
    encode ct, gen(charity_type) label(charity_type)
    label list charity_type
    
    gen byte pre_post = year >= 2010
    rename hm donated
    
    table (charity_type) (pre_post), statistic(fvpercent 1.donated) ///
        sformat("%s%%" fvpercent)  nototals
    
    xtset id
    xtreg donated i.charity_type##i.pre_post i.year, fe
    testparm i.charity_type#1.pre_post
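
    If you would rather have a single DID estimate for cultural charities versus all other types pooled, a minimal sketch might look like the following. It assumes the code above has already been run; treated is a variable name introduced here purely for illustration, and clustering the standard errors on id is my assumption rather than something established in this thread.

    ```stata
    * Assumes the -reshape-, pre_post, donated, and -xtset id- steps above have been run
    * treated is a hypothetical indicator: 1 for cultural charities (chn), 0 for every other type
    gen byte treated = (ct == "chn")

    * Two-group DID with household fixed effects; clustering on household is an assumption
    xtreg donated i.treated##i.pre_post i.year, fe vce(cluster id)

    * The interaction coefficient is the DID estimate
    lincom 1.treated#1.pre_post
    ```

    Because donated is a 0/1 variable, this is a linear probability model, so the interaction coefficient reads as a difference in the probability of donating.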
