  • Parallel Trend Assumption in Difference-in-Difference Estimation

    Hi,
    I am working on a DID model; the data set is shown below. The model consists of one treatment area and one control area, observed yearly from 2009 to 2019. There are five independent variables (loggrdp, logpopden, logvkt, cng, petrol); all other variables are dependent variables, except Diff, treat, and post. The treatment occurred in 2016. treat marks the treatment area, post marks the post-treatment period, and Diff is the DID treatment dummy. I need some guidance on the Stata commands for checking the parallel trends assumption both graphically and statistically.
    [Image: DataSet_DiD.PNG, screenshot of the data set]

    Last edited by Sakib Nazmus; 28 Jan 2022, 11:04.

  • #2
    Please read the FAQ which discusses how to ask a question.

    I can't help you until I have a reproducible example right here on screen, one with context. I can't even tell whether you have panel data or a unique identifier, and I also don't understand why your years are sorted the way they are. I'm not saying this to be mean; I'm saying it because I can't help you with decontextualized screenshots, and not until you've followed the FAQ instructions. This is an important question for your research, one I could give many thoughts on, but not with the question as it's presently formatted.

    If nobody's told you yet, welcome to Statalist.


    • #3
      Sorry for the incomplete information I shared; this is my first time posting here. This is panel data, where DHK is the treatment area and CTG is the control area. I have uploaded my data set as an Excel file: STATA_DATASET.xlsx


      • #4
        You need to upload your data using the dataex command.


        • #5
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str3 area int year float(loggrdp logvkt logpopden cng petrol tti co2capita co2ac logdt logcost logco2 logcostac) byte(treat post diff)
          "DHK" 2019 5.08289 4.89612 4.10358 .516 1.12 3.97927  .35639 .850572 3.26923  3.5905 3.85907 2.66114 1 1 1
          "DHK" 2018 5.03987 4.88075 4.08819  .48 1.12 3.68113   .2818 .672554 3.14392 3.46519 3.74172  2.5512 1 1 1
          "DHK" 2017  4.9995  4.8653 4.07275  .48 1.12   3.383 .229172  .54695 3.03242  3.3537 3.63649 2.45515 1 1 1
          "DHK" 2016 4.94728 4.84986 4.05731  .42 1.12 3.08486 .189826 .453046 2.93002 3.25129 3.53924 2.36819 1 1 1
          "DHK" 2015 4.89229 4.83441 4.04186  .42  1.3 2.78673 .158951 .379358 2.83308 3.15435 3.44671  2.2867 1 0 0
          "DHK" 2014 4.83983 4.81897 4.02641 .201  1.3 2.68755 .142177 .339324 2.77097 3.09224 3.38283 2.24002 1 0 0
          "DHK" 2013 4.77812 4.80352 4.01098 .201 1.15 2.58836 .127138 .303432 2.70863  3.0299 3.31883 2.19313 1 0 0
          "DHK" 2012 4.72709 4.78808 3.99552 .201 1.15 2.48918 .113573 .271058 2.64576 2.96703 3.25438 2.14571 1 0 0
          "DHK" 2011 4.71144 4.77263 3.98009 .201 1.09    2.39 .101267 .241686 2.58201 2.90328 3.18913  2.0974 1 0 0
          "DHK" 2010 4.66381 4.75719 3.96466 .201 1.09   2.331 .091581 .218571 2.52446 2.84573 3.13002 2.05529 1 0 0
          "DHK" 2009  4.6127 4.74174  3.9492 .201 1.17   2.272 .082753 .197501 2.46652 2.78779 3.07056 2.01281 1 0 0
          "CTG" 2019 4.56001 3.90558 3.63007 .516 1.12   3.255 .351007 .482152 2.57161 2.88212 3.23684 2.32847 0 1 0
          "CTG" 2018   4.517 3.89674 3.62123  .48 1.12 3.09642 .325438  .44703 2.53137 2.84188 3.19515 2.29706 0 1 0
          "CTG" 2017 4.47662 3.88809 3.61258  .48 1.12 2.93784 .300237 .412414 2.48916 2.79967  3.1515 2.26351 0 1 0
          "CTG" 2016  4.4244 3.87935 3.60385  .42 1.12 2.77925 .275274 .378123  2.4442 2.75472 3.10506 2.22728 0 1 0
          "CTG" 2015 4.36941 3.87073 3.59522  .42  1.3 2.62067 .250414 .343975   2.396 2.70651 3.05533 2.18771 0 0 0
          "CTG" 2014 4.31695 3.86212 3.58661 .201  1.3 2.49246 .225925 .310336  2.3427 2.65321 3.00203 2.14301 0 0 0
          "CTG" 2013 4.25524 3.85344 3.57793 .201 1.15 2.36425 .202279 .277856 2.28601 2.59652 2.94534   2.095 0 0 0
          "CTG" 2012 4.20421 3.84479 3.56928 .201 1.15 2.23604 .179331 .246334 2.22506 2.53557 2.88439  2.0427 0 0 0
          "CTG" 2011 4.18856 3.83617 3.56066 .201 1.09 2.10783 .156944 .215583 2.15852 2.46904 2.81786 1.98479 0 0 0
          "CTG" 2010 4.14094 3.82747 3.55197 .201 1.09 1.97962 .134985 .185419 2.08437 2.39488  2.7437 1.91933 0 0 0
          "CTG" 2009 4.08982 3.81882 3.54331 .201 1.17 1.85141 .113321  .15566 1.99974 2.31025 2.65908 1.84336 0 0 0
          end


          • #6
            Okay thanks, this is so much more helpful.

            Okay I see what you've got here. The issue is, you're doing a 2 by many DD study here... that is, you've only one treated unit and one comparison unit. Not bad, but not good either.

            The reality of the matter is, you can't do very much here. Unless I'm mistaken, which is perfectly possible, the most you can rely on is a graphical validation of parallel trends. So I'll just do an example for CO2 per capita.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str3 area int year float(loggrdp logvkt logpopden cng petrol tti co2capita co2ac logdt logcost logco2 logcostac) byte(treat post diff)
            "DHK" 2019 5.08289 4.89612 4.10358 .516 1.12 3.97927  .35639 .850572 3.26923  3.5905 3.85907 2.66114 1 1 1
            "DHK" 2018 5.03987 4.88075 4.08819  .48 1.12 3.68113   .2818 .672554 3.14392 3.46519 3.74172  2.5512 1 1 1
            "DHK" 2017  4.9995  4.8653 4.07275  .48 1.12   3.383 .229172  .54695 3.03242  3.3537 3.63649 2.45515 1 1 1
            "DHK" 2016 4.94728 4.84986 4.05731  .42 1.12 3.08486 .189826 .453046 2.93002 3.25129 3.53924 2.36819 1 1 1
            "DHK" 2015 4.89229 4.83441 4.04186  .42  1.3 2.78673 .158951 .379358 2.83308 3.15435 3.44671  2.2867 1 0 0
            "DHK" 2014 4.83983 4.81897 4.02641 .201  1.3 2.68755 .142177 .339324 2.77097 3.09224 3.38283 2.24002 1 0 0
            "DHK" 2013 4.77812 4.80352 4.01098 .201 1.15 2.58836 .127138 .303432 2.70863  3.0299 3.31883 2.19313 1 0 0
            "DHK" 2012 4.72709 4.78808 3.99552 .201 1.15 2.48918 .113573 .271058 2.64576 2.96703 3.25438 2.14571 1 0 0
            "DHK" 2011 4.71144 4.77263 3.98009 .201 1.09    2.39 .101267 .241686 2.58201 2.90328 3.18913  2.0974 1 0 0
            "DHK" 2010 4.66381 4.75719 3.96466 .201 1.09   2.331 .091581 .218571 2.52446 2.84573 3.13002 2.05529 1 0 0
            "DHK" 2009  4.6127 4.74174  3.9492 .201 1.17   2.272 .082753 .197501 2.46652 2.78779 3.07056 2.01281 1 0 0
            "CTG" 2019 4.56001 3.90558 3.63007 .516 1.12   3.255 .351007 .482152 2.57161 2.88212 3.23684 2.32847 0 1 0
            "CTG" 2018   4.517 3.89674 3.62123  .48 1.12 3.09642 .325438  .44703 2.53137 2.84188 3.19515 2.29706 0 1 0
            "CTG" 2017 4.47662 3.88809 3.61258  .48 1.12 2.93784 .300237 .412414 2.48916 2.79967  3.1515 2.26351 0 1 0
            "CTG" 2016  4.4244 3.87935 3.60385  .42 1.12 2.77925 .275274 .378123  2.4442 2.75472 3.10506 2.22728 0 1 0
            "CTG" 2015 4.36941 3.87073 3.59522  .42  1.3 2.62067 .250414 .343975   2.396 2.70651 3.05533 2.18771 0 0 0
            "CTG" 2014 4.31695 3.86212 3.58661 .201  1.3 2.49246 .225925 .310336  2.3427 2.65321 3.00203 2.14301 0 0 0
            "CTG" 2013 4.25524 3.85344 3.57793 .201 1.15 2.36425 .202279 .277856 2.28601 2.59652 2.94534   2.095 0 0 0
            "CTG" 2012 4.20421 3.84479 3.56928 .201 1.15 2.23604 .179331 .246334 2.22506 2.53557 2.88439  2.0427 0 0 0
            "CTG" 2011 4.18856 3.83617 3.56066 .201 1.09 2.10783 .156944 .215583 2.15852 2.46904 2.81786 1.98479 0 0 0
            "CTG" 2010 4.14094 3.82747 3.55197 .201 1.09 1.97962 .134985 .185419 2.08437 2.39488  2.7437 1.91933 0 0 0
            "CTG" 2009 4.08982 3.81882 3.54331 .201 1.17 1.85141 .113321  .15566 1.99974 2.31025 2.65908 1.84336 0 0 0
            end
            
            egen id = group(area), label
            
            xtset id year, y
            
            
            graph close _all
            graph drop _all
            
            
            foreach v of var co2capita logcost {
            
            tw ///
            (line `v' year if id ==1, lcolor(pink)) /// Untreated
            (line `v' year if id==2, lcolor(black) lwidth(medthick)), /// Treated
            legend(order(1 "Untreated" 2 "Treated")) name(`v') xli(2016)
            }
            So, let's remind ourselves what the basic PTA means: when we strip away the technical mathematics, PTA posits that the comparison unit is a good counterfactual for the treated unit, based on the idea that the units' trends would have continued in the same direction absent the intervention.

            Looking at the graph of CO2, the untreated unit doesn't appear at all to be a good comparison unit for the treated one. The treated unit appears to have a quadratic pre-intervention trend, whereas the untreated unit appears to have a linear trend. This implies that the treated unit has a different data generating process than the untreated, and that it would not be a good comparison unit for the treated one. Similar comments might be made about logcost. The untreated unit's trend appears to be almost flattening, whereas the treated unit's trend is persistently rising. They diverge in different directions before the intervention, and that's not what we want.
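
            Another way to look at the same thing is to plot the gap between the two areas: under parallel trends it should stay roughly constant before 2016. This is only a minimal sketch, assuming the data above are still in memory; the variable name gap is introduced here purely for illustration.

            ```stata
            * Assumes the -dataex- data above are in memory.
            preserve
            keep area year co2capita
            * One row per year; creates co2capitaCTG and co2capitaDHK
            reshape wide co2capita, i(year) j(area) string
            * gap is a hypothetical name: treated (DHK) minus untreated (CTG)
            generate gap = co2capitaDHK - co2capitaCTG
            twoway connected gap year, xline(2016) ytitle("DHK minus CTG: CO2 per capita")
            restore
            ```

            A gap that drifts steadily in the pre-period tells the same story as two lines with visibly different slopes.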

            Additionally, both outcomes' trends for the treated unit begin to rise in 2015, a year before the intervention took place. This implies anticipation, especially since it happens for both outcomes. Or, put a little differently, it implies that there are dormant common factors which "wake up" in the year before the policy was passed; thus, how can we be sure it was the intervention that impacted the outcomes instead of something else? If you had more than one comparison unit, there would be other ways to mitigate this issue, like matching or synthetic controls, but in a two-unit setup, the most you could hope for aside from standard regression would be a simple two-unit interrupted time series approach.

            My advice to you is this: I don't know what topic you're studying, but you have a few options. Either do the simple DD approach and acknowledge the parallel trends violations, or, if at all possible, get data on additional comparison units so that you can better justify PTA or get around the issue altogether with a simple synthetic control approach. Sakib Nazmus


            • #7
              Thanks a lot! Actually, my treatment initially started in 2015, and I am assuming it had an impact from the beginning. Before this PTA check I ran a DD analysis, in which the treatment shows a significant positive effect. The command I used is
              Code:
              reg tti Diff loggrdp logpopden logvkt cng petrol i.post, robust cluster(treat)


              • #8
                Okay, so I was incorrect about anticipation then. Alright. Your specification for the DD equation is wrong, however. What you want is to interact the post variable with the treated variable, so

                Code:
                reg tti i.post##i.treat loggrdp logpopden logvkt cng petrol, robust cluster(treat)
                accomplishes this in Stata. You don't need to create your own DD term; Stata's factor-variable interaction does that for you.
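
                To see that the interaction term really is the hand-made dummy, one can compare the two directly. This is a sketch only, assuming the -dataex- example from #5 is in memory (there the dummy is named diff, in lowercase); did_check is a hypothetical name used just for the comparison, and covariates are left out to keep the example short.

                ```stata
                * Assumes the -dataex- example from #5 is in memory (it already contains diff).
                * did_check is a hypothetical name used only for this comparison.
                generate byte did_check = treat * post
                assert did_check == diff        // hand-made dummy equals treat x post

                * Factor-variable version: both main effects plus the interaction
                regress tti i.post##i.treat
                * The DD estimate is the coefficient on the interaction term
                lincom 1.post#1.treat
                ```

                Note that the ## operator also adds the main effect of treat, which a regression on the dummy and i.post alone leaves out.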


                • #9
                  Could you show me the xtdidregress command for any one of the dependent variables in my DID model?


                  • #10
                    I don't understand the question. What do you mean?


                    • #11
                      I did the analysis with the reg command in Stata 16. Stata 17 has a command, xtdidregress, which I want to use for my analysis. It would be very helpful if you could guide me on this.


                      • #12
                        I've never used xtdidregress, but the help file should be pretty explicit about how to use it.
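
                        For orientation, the general shape of the call documented in the help file looks like the sketch below. It assumes Stata 17 or later and the -dataex- example from #5 in memory, and it has not been run on these data. cng and petrol are left out because they take the same value in both areas each year, so the time effects absorb them. With only two groups the cluster-robust standard errors, and any test built on them, may be unreliable or unavailable.

                        ```stata
                        * Assumes Stata 17+ and the -dataex- example from #5 in memory.
                        egen id = group(area), label    // numeric panel id; skip if id already exists
                        xtset id year

                        * (outcome covariates) (treatment dummy), with group and time effects
                        xtdidregress (tti loggrdp logpopden logvkt) (diff), group(id) time(year)

                        estat trendplots                // graphical diagnostics for parallel trends
                        estat ptrends                   // test of parallel pre-treatment trends
                        ```

                        Both estat commands are postestimation tools for xtdidregress, so they have to be run right after it.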


                        • #13
                          Following your valuable advice and some further research, I realized I was missing something basic: my data set is panel data, but I was using a cross-sectional command for my analysis, namely
                          Code:
                           reg tti Diff loggrdp logpopden logvkt cng petrol i.post, robust cluster(treat)
                          Now I think the command for panel data should be:
                          Code:
                          gen diff =treat*post
                          xtset treat year
                          xtreg tti diff loggrdp logpopden logvkt cng petrol i.post, robust cluster(treat)

                          I want to include time fixed effects, which is why I used i.post, and my independent variables are correlated, so I added the robust cluster(treat) part.

                          Let me know whether my command is right or wrong for my analysis.
                          Thank you. Jared Greathouse
                          Last edited by Sakib Nazmus; 02 Feb 2022, 09:32.


                          • #14
                            Both of these would work for DD, as would many others. You'll need to read the help files to know whether one of these is right for you; I can't make that choice.
