  • Help with Classical DID, Parallel trend plot

    Dear All
    I would like to run a classical DID; below is a sample dataset for illustration.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str1 id int year float depvar byte treat int(indep_var1 indep_var2) float post
    "a" 2005     50 0  66 133 0
    "a" 2006     53 0  55 120 0
    "a" 2007     53 0  49 241 0
    "a" 2008     58 1  33 217 1
    "a" 2009     65 1  21 143 1
    "a" 2010     69 1  97  45 1
    "a" 2011     73 1 160 196 1
    "b" 2005     33 0 157 226 0
    "b" 2006     33 0 230 188 0
    "b" 2007     36 0  68 126 0
    "b" 2008     39 1 152 217 1
    "b" 2009     44 1 236 219 1
    "b" 2010     46 1 196 216 1
    "b" 2011     52 1 216 160 1
    "c" 2005     35 0 133  93 0
    "c" 2006     36 0  96 190 0
    "c" 2007     38 0 231 177 0
    "c" 2008     36 0  42 138 1
    "c" 2009     33 0 208 236 1
    "c" 2010     31 0 104 163 1
    "c" 2011     31 0  26  82 1
    "d" 2005     26 0 103 206 0
    "d" 2006     27 0  66 155 0
    "d" 2007     24 0  30  61 0
    "d" 2008     24 0 234  52 1
    "d" 2009     23 0 145 139 1
    "d" 2010     22 0 180  32 1
    "d" 2011     21 0 129  87 1
    "e" 2005     66 0  38 131 0
    "e" 2006     69 0 243 233 0
    "e" 2007     66 0  85 211 0
    "e" 2008     65 0 115  66 1
    "e" 2009     64 0 213  78 1
    "e" 2010     64 0 224 144 1
    "e" 2011     64 0 142 237 1
    "f" 2005     18 0 143  31 0
    "f" 2006     18 0  64  31 0
    "f" 2007     18 0 233 166 0
    "f" 2008     23 1 223 158 1
    "f" 2009     26 1 135  82 1
    "f" 2010     29 1 171 150 1
    "f" 2011     32 1 240  46 1
    "g" 2005  98.82 0  42  37 0
    "g" 2006 101.82 0  95  60 0
    "g" 2007 101.82 0  74  82 0
    "g" 2008 106.82 1  65  40 1
    "g" 2009 113.82 1 100  55 1
    "g" 2010 117.82 1  86  77 1
    "g" 2011 121.82 1  77  44 1
    "h" 2005  81.82 0  45  15 0
    "h" 2006  81.82 0  91  59 0
    "h" 2007  84.82 0  73  19 0
    "h" 2008  87.82 1  89  71 1
    "h" 2009  92.82 1  33  60 1
    "h" 2010  94.82 1  54  48 1
    "h" 2011 100.82 1  63  36 1
    "i" 2005  83.82 0  32  16 0
    "i" 2006  84.82 0  48  98 0
    "i" 2007  86.82 0  10  46 0
    "i" 2008  84.82 0  19  88 1
    "i" 2009  81.82 0  61  27 1
    "i" 2010  79.82 0  21  91 1
    "i" 2011  79.82 0  48  38 1
    "j" 2005  74.82 0  36  22 0
    "j" 2006  75.82 0  28  53 0
    "j" 2007  72.82 0  36  94 0
    "j" 2008  72.82 0  32  51 1
    "j" 2009  71.82 0 100  11 1
    "j" 2010  70.82 0  89  82 1
    "j" 2011  69.82 0  54  92 1
    "k" 2005 114.82 0  23  64 0
    "k" 2006 117.82 0  45  14 0
    "k" 2007 114.82 0  94  23 0
    "k" 2008 113.82 0  25  13 1
    "k" 2009 112.82 0  39  91 1
    "k" 2010 112.82 0  36  35 1
    "k" 2011 112.82 0  74  18 1
    "l" 2005  66.82 0  69  47 0
    "l" 2006  66.82 0  24  38 0
    "l" 2007  66.82 0  44  24 0
    "l" 2008  71.82 1  57  35 1
    "l" 2009  74.82 1  72  60 1
    "l" 2010  77.82 1  36  23 1
    "l" 2011  80.82 1  55  97 1
    end

    In the data above, treat is coded 1 for the treated firms (a, b, f, g, h, l) and 0 otherwise; post is coded 1 for the treatment period, which runs from 2008 through 2011. Given this, I started with a parallel trend plot of my depvar, using the following commands:

    Code:
    ssc install lgraph
    lgraph depvar year, by( treat )
    * (attached graph: L_Graph.gph)

    I am not sure whether this graph makes sense, or whether it is even correct. I adapted it from one of George Ford's posts (https://www.statalist.org/forums/for...84#post1723184).


    Next, I ran a DID with the code below; here is what I got:
    Code:
    . encode id, gen (ID)
    
    . xtset ID year
    
    Panel variable: ID (strongly balanced)
     Time variable: year, 2005 to 2011
             Delta: 1 unit
    
    . xtreg depvar i.post##i.treat i.year, fe vce (r)
    note: 0b.post#1.treat identifies no observations in the sample.
    note: 1.post#1.treat omitted because of collinearity.
    note: 2011.year omitted because of collinearity.
    
    Fixed-effects (within) regression               Number of obs     =         84
    Group variable: ID                              Number of groups  =         12
    
    R-squared:                                      Obs per group:
         Within  = 0.7678                                         min =          7
         Between = 0.0000                                         avg =        7.0
         Overall = 0.0119                                         max =          7
    
                                                    F(5,11)           =          .
    corr(u_i, Xb) = -0.0969                         Prob > F          =          .
    
                                        (Std. err. adjusted for 12 clusters in ID)
    ------------------------------------------------------------------------------
                 |               Robust
          depvar | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          1.post |   .0833333   1.292325     0.06   0.950    -2.761054    2.927721
         1.treat |   14.83333   .8870762    16.72   0.000     12.88089    16.78577
                 |
      post#treat |
            0 1  |          0  (empty)
            1 1  |          0  (omitted)
                 |
            year |
           2006  |   1.333333   .3929874     3.39   0.006     .4683738    2.198293
           2007  |   1.166667   .6146741     1.90   0.084     -.186222    2.519555
           2008  |  -4.666667   2.505364    -1.86   0.089    -10.18094    .8476017
           2009  |         -3   1.397337    -2.15   0.055    -6.075519    .0755189
           2010  |         -2   .7929615    -2.52   0.028    -3.745296   -.2547036
           2011  |          0  (omitted)
                 |
           _cons |      62.41   .4615509   135.22   0.000     61.39413    63.42587
    -------------+----------------------------------------------------------------
         sigma_u |  31.013364
         sigma_e |  2.7728405
             rho |  .99206962   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . xtreg depvar i.post##i.treat i.year, re
    note: 0.post#1.treat identifies no observations in the sample.
    note: 1.post#1.treat omitted because of collinearity.
    note: 2011.year omitted because of collinearity.
    
    Random-effects GLS regression                   Number of obs     =         84
    Group variable: ID                              Number of groups  =         12
    
    R-squared:                                      Obs per group:
         Within  = 0.7678                                         min =          7
         Between = 0.0000                                         avg =        7.0
         Overall = 0.0119                                         max =          7
    
                                                    Wald chi2(7)      =     217.05
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
          depvar | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          1.post |   .0936147   1.279556     0.07   0.942    -2.414268    2.601498
         1.treat |   14.81277   1.215402    12.19   0.000     12.43063    17.19492
                 |
      post#treat |
            0 1  |          0  (empty)
            1 1  |          0  (omitted)
                 |
            year |
           2006  |   1.333333   1.126038     1.18   0.236    -.8736607    3.540327
           2007  |   1.166667   1.126038     1.04   0.300    -1.040327    3.373661
           2008  |  -4.666667   1.126038    -4.14   0.000    -6.873661   -2.459673
           2009  |         -3   1.126038    -2.66   0.008    -5.206994   -.7930059
           2010  |         -2   1.126038    -1.78   0.076    -4.206994    .2069941
           2011  |          0  (omitted)
                 |
           _cons |      62.41   9.277165     6.73   0.000     44.22709    80.59291
    -------------+----------------------------------------------------------------
         sigma_u |  32.188195
         sigma_e |  2.7728405
             rho |  .99263376   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    .
    What is happening here, and why does the interaction term for my DID get omitted? What is the problem with the data, and how should I go about fixing it? Please help, as I need some assistance with both the plot and the coefficients.
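
    A possible reading of the notes above, assuming the posted data are coded as shown: treat equals 1 only in 2008 through 2011 for firms a, b, f, g, h and l, so it is already the product of a group flag and post. The cell post = 0, treat = 1 is therefore empty, and 1.post#1.treat duplicates 1.treat, which would make the coefficient on 1.treat the DID estimate. post itself is a linear combination of the year dummies, which is why one year is dropped. The sketch below checks this and fits the model without the redundant terms; ever_treated is a hypothetical variable name, not part of the original data.

    ```stata
    * Assumes the -dataex- data above are in memory.
    * ever_treated is a hypothetical name for a time-invariant group flag.
    bysort id: egen byte ever_treated = max(treat)

    * In the posted data, treat should equal ever_treated * post in every row.
    assert treat == ever_treated * post

    * Create the numeric panel id only if it does not exist yet.
    capture encode id, gen(ID)
    xtset ID year

    * Two-way fixed effects DID: unit effects via fe, time effects via i.year.
    * treat is the time-varying treatment indicator, so no interaction is needed.
    xtreg depvar i.treat i.year, fe vce(cluster ID)

    * The same specification through the built-in DID command.
    xtdidregress (depvar) (treat), group(ID) time(year)
    ```

    With only 12 clusters, the cluster-robust standard errors from either command should be read with caution.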
    Last edited by lal mohan kumar; 11 Apr 2024, 00:33.

  • #2
    Lal:
    why not start off with -xtdidregress-?
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Dear Carlo Lazzaro, thank you very much for the swift response. I am still learning -xtdidregress-, as I haven't used it before. However, to make sure I understand the basics clearly, I would like to start with trend plots, and I am not getting the kind of plot we usually see in articles. I tried to use

      Code:
      preserve
      collapse (mean) depvar, by(treat year)
      reshape wide depvar, i(year) j(treat)
      graph twoway connect depvar* year if year < 2008  
      restore
      which was suggested here ( https://www.statalist.org/forums/for...18#post1601218), but that is not helping either, as the graph looks bizarre.
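
      One likely cause of the odd-looking graphs, assuming the data from the first post: treat changes over time, so grouping by treat puts the treated firms' pre-2008 observations on the treat = 0 line, and the treat = 1 line only starts in 2008. Restricting to year < 2008 then leaves no treat = 1 observations at all. A minimal sketch of the usual group-means plot follows; ever_treated is a hypothetical name for a time-invariant group flag.

      ```stata
      * Assumes the -dataex- data from the first post are in memory.
      * ever_treated (hypothetical name) is 1 in all years for firms that are ever treated.
      bysort id: egen byte ever_treated = max(treat)

      preserve
      * One mean of depvar per group and year.
      collapse (mean) depvar, by(ever_treated year)
      * One connected line per group, with a vertical line just before 2008.
      twoway (connected depvar year if ever_treated == 0)                     ///
             (connected depvar year if ever_treated == 1),                    ///
             xline(2007.5)                                                    ///
             legend(order(1 "Never treated" 2 "Treated from 2008"))           ///
             ytitle("Mean of depvar") xtitle("Year")
      restore
      ```

      Plotting all years rather than only the pre-treatment years keeps both the pre-period trends and the post-period divergence visible in one graph.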



      • #4
        Lal:
        as Stata warns you, the interaction terms are omitted because one identifies no observations and the other is perfectly collinear, respectively.
        In addition, your graphs do not mark any treatment year to distinguish the pre-treatment from the post-treatment period for the treated.
        Again, provided that your dataset is set up correctly for DID, switching to -xtdidregress- would give you what you're after via the -estat trendplots- postestimation command:
        Code:
        . use https://www.stata-press.com/data/r18/parallelt
        (Simulated data to test parallel-trends assumption)
        
        . xtset id1
        
        Panel variable: id1 (unbalanced)
        
        . xtdidregress (y1 c.x1##c.x2) (treated1), group(id1) time(t1)
        
        Treatment and time information
        
        Time variable: t1
        Control:       treated1 = 0
        Treatment:     treated1 = 1
        -----------------------------------
                     |   Control  Treatment
        -------------+---------------------
        Group        |
                 id1 |       102         98
        -------------+---------------------
        Time         |
             Minimum |         1          6
             Maximum |         1          6
        -----------------------------------
        
        Difference-in-differences regression                     Number of obs = 2,000
        Data type: Longitudinal
        
                                                     (Std. err. adjusted for 200 clusters in id1)
        -----------------------------------------------------------------------------------------
                                |               Robust
                             y1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        ------------------------+----------------------------------------------------------------
        ATET                    |
                       treated1 |
        (Treated vs Untreated)  |   .5069426   .0220218    23.02   0.000     .4635166    .5503686
        -----------------------------------------------------------------------------------------
        Note: ATET estimate adjusted for covariates, panel effects, and time effects.
        
        . estat trendplots
        
        .
        Kind regards,
        Carlo
        (StataNow 18.5)



        • #5
          Dear Carlo Lazzaro, thanks for the sample data and illustration (I didn't know about them). However, I am unable to understand the data, as it is not in the typical long-form panel layout. Do you know of any sample panel data that works for classical DID? I hope I am not troubling you, but a panel dataset amenable to a classical DID illustration would be extremely helpful.
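
          One way to get such a dataset is to simulate it. The sketch below builds a small long-form panel with a common treatment date and runs the standard DID commands on it. Every variable name, the seed, and the coefficient values are made up for illustration; the true effect is set to 3 by construction.

          ```stata
          * Simulated long-form panel for a classical DID (all names and values are hypothetical).
          clear
          set seed 12345
          set obs 12                           // 12 units
          gen int id = _n
          gen byte treated = id <= 6           // units 1-6 form the treated group
          expand 7                             // 7 yearly rows per unit
          bysort id: gen int year = 2004 + _n  // years 2005 to 2011
          gen byte post = year >= 2008         // treatment starts in 2008
          gen byte did = treated * post        // time-varying treatment indicator

          * Common linear trend, a level shift for the treated group, a true effect of 3.
          gen double y = 20 + 5*treated + 0.5*(year - 2005) + 3*did + rnormal()

          xtset id year
          xtdidregress (y) (did), group(id) time(year)
          estat trendplots
          estat ptrends
          ```

          Because both groups share the same trend in the simulation, the trend plot should show roughly parallel lines before 2008.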



          • #6
            Lal:
            sorry, no.
            But for guidance on DID as it was done before -didregress- existed, see the PowerPoint presentation (princeton.edu).
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #7
              Dear Carlo Lazzaro, thanks for the excellent reference. Working from it and from the Stata forum, I used the following commands:

              Code:
              use "http://www.princeton.edu/~otorres/WDI.dta", clear
              * Fake event X happens in 2009 affecting all countries
              * Creating the before/after dummy variable: 0 = before, 1 =after
              gen after = (year >= 2009) if !missing(year)
              merge m:1 country using "http://www.princeton.edu/~otorres/Treated.dta", gen(merge1)
              * The untreated units will have a missing value (".")
              replace treated = 0 if treated == .
              gen did = after * treated
              encode country, gen(country1)
              xtset country1 year 
              
              *Plotting for Parallel trend
              lgraph gdppc year, by( treated ) xline(2009)
              * (attached graph: Graph_Way1.gph)
              
              xtreg gdppc did imports labor i.year , fe vce(cluster country1)
              
              Fixed-effects (within) regression               Number of obs     =      2,772
              Group variable: country1                        Number of groups  =        126
              
              R-squared:                                      Obs per group:
                   Within  = 0.3057                                         min =         22
                   Between = 0.0925                                         avg =       22.0
                   Overall = 0.0972                                         max =         22
              
                                                              F(24,125)         =       9.91
              corr(u_i, Xb) = -0.0849                         Prob > F          =     0.0000
              
                                           (Std. err. adjusted for 126 clusters in country1)
              ------------------------------------------------------------------------------
                           |               Robust
                     gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                       did |   1083.795   553.1513     1.96   0.052    -10.96023     2178.55
                   imports |   1.62e-08   5.42e-09     2.98   0.003     5.44e-09    2.69e-08
                     labor |  -.0001776   .0000453    -3.92   0.000    -.0002673   -.0000878
                           |
                      year |
                     2001  |   192.3805   33.94936     5.67   0.000     125.1905    259.5705
                     2002  |   383.6992   67.48982     5.69   0.000     250.1285    517.2699
                     2003  |   596.9964   96.40343     6.19   0.000     406.2021    787.7908
                     2004  |   1010.178   171.1637     5.90   0.000      671.424    1348.933
                     2005  |   1317.479   210.0023     6.27   0.000      901.858    1733.099
                     2006  |    1720.05   274.9326     6.26   0.000     1175.925    2264.176
                     2007  |   2172.455   358.9161     6.05   0.000     1462.115    2882.795
                     2008  |   2208.525   364.8852     6.05   0.000     1486.372    2930.678
                     2009  |   1311.352   307.3992     4.27   0.000      702.971    1919.734
                     2010  |   1563.268   352.8561     4.43   0.000     864.9218    2261.614
                     2011  |   1798.775   419.9763     4.28   0.000     967.5901     2629.96
                     2012  |   1915.791   456.5907     4.20   0.000     1012.142    2819.441
                     2013  |   2084.955   512.9638     4.06   0.000     1069.736    3100.174
                     2014  |    2234.28    499.513     4.47   0.000     1245.682    3222.878
                     2015  |   2345.149   409.9833     5.72   0.000     1533.741    3156.557
                     2016  |   2555.844   428.4298     5.97   0.000     1707.928     3403.76
                     2017  |    2841.42   472.1121     6.02   0.000     1907.051    3775.788
                     2018  |   3100.052   508.6039     6.10   0.000     2093.462    4106.642
                     2019  |   3284.786   513.2284     6.40   0.000     2269.043    4300.529
                     2020  |   2330.943   476.1929     4.89   0.000     1388.498    3273.387
                     2021  |     3034.1   517.2743     5.87   0.000      2010.35     4057.85
                           |
                     _cons |   13832.09   529.9392    26.10   0.000     12783.28    14880.91
              -------------+----------------------------------------------------------------
                   sigma_u |  18555.692
                   sigma_e |  2562.3242
                       rho |  .98128842   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              
              . 
              
              
              . xtdidregress (gdppc imports labor) (did), group(country1) time(year)
              
              Number of groups and treatment time
              
              Time variable: year
              Control:       did = 0
              Treatment:     did = 1
              -----------------------------------
                           |   Control  Treatment
              -------------+---------------------
              Group        |
                  country1 |        58         68
              -------------+---------------------
              Time         |
                   Minimum |      2000       2009
                   Maximum |      2000       2009
              -----------------------------------
              
              Difference-in-differences regression                     Number of obs = 2,772
              Data type: Longitudinal
              
                                           (Std. err. adjusted for 126 clusters in country1)
              ------------------------------------------------------------------------------
                           |               Robust
                     gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
              ATET         |
                       did |
                 (1 vs 0)  |   1083.795   553.1513     1.96   0.052    -10.96023     2178.55
              ------------------------------------------------------------------------------
              Note: ATET estimate adjusted for covariates, panel effects, and time effects.

              Code:
              estat trendplots, ytitle(GDP pc)
              * (attached graph: Graph_Estat plot.gph)
              .
              On inspection, it seems to me that the parallel trend plot from the first approach (lgraph) and the one from -estat trendplots- are the same. Is that true? Also, how should I interpret the estat graph?


              • #8
                Lal:
                yes, it seems so.
                You can check whether the parallel trends assumption is supported after -didregress- or -xtdidregress- via:
                Code:
                estat ptrends
                The null of the test is that the trends are parallel in the pretreatment period, as seems to be the case from a visual inspection of your graphs.
                Kind regards,
                Carlo
                (StataNow 18.5)



                • #9
                  Dear Carlo Lazzaro, thank you once again, also for -estat ptrends-. However, there is a caveat: these -estat- commands seem to work only when the panel is balanced. For instance, if we remove one post-treatment observation from the same dataset (I removed the Albania observation for 2009), Stata reports "treatment assignment times vary; not allowed with estat ptrend", presumably because Albania's first treated year then becomes 2010. In such cases we have to fall back on lgraph, as -estat trendplots- won't work.
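
                  For the unbalanced case, a hand-rolled check is one option. The sketch below assumes the WDI data from the earlier post are in memory with treated, year and country1 already created. It plots group means by year, which does not depend on a balanced panel, and then tests for a differential linear pre-2009 trend. The regression is a rough analogue of -estat ptrends-, not a reproduction of what that command computes.

                  ```stata
                  * Assumes the WDI data from the earlier post are in memory, with
                  * treated (time-invariant), year, and country1 already created.

                  * Visual check: group means by year, usable with an unbalanced panel.
                  preserve
                  collapse (mean) gdppc, by(treated year)
                  twoway (connected gdppc year if treated == 0)        ///
                         (connected gdppc year if treated == 1),       ///
                         xline(2009) legend(order(1 "Control" 2 "Treated"))
                  restore

                  * Rough numeric check on pre-treatment years only: the coefficient on
                  * 1.treated#c.year is the extra linear trend of the treated group.
                  xtreg gdppc 1.treated#c.year i.year if year < 2009, fe vce(cluster country1)
                  ```

                  A coefficient on 1.treated#c.year that is small relative to its standard error would be consistent with parallel pre-treatment trends, though it does not establish them.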
