  • Reproducing "estat trendplots" linear-trends model

    Hello Statalist community,

    I am doing an analysis using a difference-in-differences methodology. Since I use Stata 17, I can use the commands “didregress” and “estat trendplots”.

    However, “estat trendplots” does not provide confidence intervals for the “Observed means” and “Linear-trends model” graphs. It is quite easy to reproduce the “Observed means” graph and to add confidence intervals, but I struggle even to reproduce the linear-trends model.

    Is there a way to reproduce the exact same graph and to add the confidence intervals? Since it is an official Stata command, I think there is no .ado file that I could read to understand the process, am I right?



    Below are the code I use with didregress and an extract of my sample.
    Code:
    didregress (wdds hheadage hheadsex hhmemtotal i.year) (Treat) [pweight=perweight], group(region) time(period)


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(wdds Treat Time Treatment hheadage hheadsex hhmemtotal) int year float period double perweight float region
    0 0 1 0 31 1  3 2014 2 1.382122 47
    0 0 0 1 36 1  6 2008 1   .83317  2
    5 1 1 1 63 1  5 2014 2  .713237  2
    0 0 0 0 27 1  4 2008 1  .999403 47
    2 0 0 0 86 1 36 2005 0 1.294421 47
    1 0 0 0 55 1 11 2008 1   .83317  1
    0 0 1 0 24 1  2 2014 2 1.382122 47
    0 0 1 0 41 1  3 2014 2  .713237  1
    0 0 1 0 26 1  2 2014 2  .713237  1
    4 0 1 0 30 1  4 2016 2   .97527 39
    0 0 1 0 25 2  2 2016 2  .858931 39
    0 0 1 0 28 1  3 2014 2  .863929  1
    3 0 1 0 34 1  5 2014 2  .713237  1
    2 0 0 0 53 2  8 2013 1  .278545 27
    0 0 0 0 29 1  4 2005 0  .885734 47
    2 0 0 1 35 1  6 2005 0  .709445  2
    0 0 0 1 35 1  5 2011 1 1.283543 50
    1 0 1 0 31 1  5 2014 2 1.121907 47
    0 0 0 0 33 1  3 2008 0    1.223 27
    0 0 0 1 37 1  6 2006 0  .868114 50
    end
    Thank you in advance.

    Regards

  • #2
    Hi Antoine,

    From a purely technical point of view, it would be possible to compute confidence intervals for the predicted values from the linear-trends model. However, I can't see a use case where this would make a lot of sense. Notice that the predicted values from the linear-trends model are not to be interpreted as meaningful expected values of the outcome, for example in the sense of potential outcome means. That is because, in order to compute meaningful predicted values for both groups, (ever) treated and controls (i.e., never treated), in both the pre- and post-intervention time periods, one would have to estimate a distinct intercept for each group. With the linear-trends model, however, only one intercept is identifiable, so the predicted values are, by construction of the model, constrained to be equal at one of the time points. If you were to look at overlapping or non-overlapping confidence intervals at certain time points, the result would depend arbitrarily on the chosen centering point. In a model with uncentered time, that centering point would be time==0.

    With that said, notice that the model used with estat trendplots is the same as the model used for the test implemented in estat ptrends. To look at the numerical results of this model, we can use estat ptrends with the verbose option:
    Code:
    * Example data:
    webuse hospdd
    
    * Example model:
    didregress (satis frequency) (procedure), group(hospital) time(month)
    
    * Linear trends model results:
    estat ptrends, verbose
    This yields the following output:
    Code:
    . * Example data:
    . webuse hospdd
    (Artificial hospital admission procedure data)
    . 
    . * Example model:
    . didregress (satis frequency) (procedure), group(hospital) time(month)
    
    Number of groups and treatment time
    
    Time variable: month
    Control:       procedure = 0
    Treatment:     procedure = 1
    -----------------------------------
                 |   Control  Treatment
    -------------+---------------------
    Group        |
        hospital |        28         18
    -------------+---------------------
    Time         |
         Minimum |         1          4
         Maximum |         1          4
    -----------------------------------
    
    Difference-in-differences regression                     Number of obs = 7,368
    Data type: Repeated cross-sectional
    
                                  (Std. err. adjusted for 46 clusters in hospital)
    ------------------------------------------------------------------------------
                 |               Robust
           satis | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
       procedure |
           (New  |
             vs  |
           Old)  |   .8479879   .0321143    26.41   0.000     .7833063    .9126694
    ------------------------------------------------------------------------------
    Note: ATET estimate adjusted for covariates, group effects, and time effects.
    
    . 
    . * Linear trends model results:
    . estat ptrends, verbose
    
    Linear regression, absorbing indicators             Number of obs     =  7,368
    Absorbed variable: hospital                         No. of categories =     46
                                                        F(10, 45)         =  98.36
                                                        Prob > F          = 0.0000
                                                        R-squared         = 0.5366
                                                        Adj R-squared     = 0.5331
                                                        Root MSE          = 0.7214
    
                                  (Std. err. adjusted for 46 clusters in hospital)
    ------------------------------------------------------------------------------
           satis | Coefficient  Legend
    -------------+----------------------------------------------------------------
           month |
       February  |  -.0041014  _b[2.month]
          March  |   .0329811  _b[3.month]
          April  |   .0111939  _b[4.month]
            May  |  -.0018237  _b[5.month]
           June  |  -.0031572  _b[6.month]
           July  |  -.0174126  _b[7.month]
                 |
       procedure |
            New  |   .7335747  _b[1.procedure]
       frequency |   .0537506  _b[frequency]
                 |
        __000004#|
        __000005#|
         c.month |
            1 0  |  -.0132409  _b[1.__000004#0b.__000005#c.month]
            1 1  |   .0165894  _b[1.__000004#1.__000005#c.month]
                 |
           _cons |   3.317235  _b[_cons]
    ------------------------------------------------------------------------------
    
    Testing coefficient for:
    _b[1.__000004#0b.__000005#c.month]=0
    
     ( 1)  1.__000004#0b.__000005#c.month = 0
    
           F(  1,    45) =    0.55
                Prob > F =    0.4616
    
    Parallel-trends test (pretreatment time period)
    H0: Linear trends are parallel
    
    F(1, 45) =   0.55
    Prob > F = 0.4616
    Here, the results of interest are the two coefficients near the bottom of the table. The first one captures the difference in linear slopes between (ever) treated and controls (never treated) in the pre-intervention era, and the second one captures that difference in the post-intervention era. If the first coefficient is zero, pre-intervention trends would be perfectly parallel. Now, we could replicate these results 'manually' using an OLS estimator. Here, we use Stata's areg command:
    Code:
    * Replicating results from linear trends model:
    bys hospital : egen evertreated = max(procedure)
    gen post = month > 3
    areg satis frequency i.month 1.post#1.evertreated       ///
               i.post#1.evertreated#c.month,                ///
               absorb(hospital) vce(cluster hospital)
    This yields the following output:
    Code:
    . * Replicating results from linear trends model:
    . bys hospital : egen evertreated = max(procedure)
    
    . gen post = month > 3
    
    . areg satis frequency i.month 1.post#1.evertreated       ///
    >            i.post#1.evertreated#c.month,                ///
    >            absorb(hospital) vce(cluster hospital)
    
    Linear regression, absorbing indicators             Number of obs     =  7,368
    Absorbed variable: hospital                         No. of categories =     46
                                                        F(10, 45)         =  98.36
                                                        Prob > F          = 0.0000
                                                        R-squared         = 0.5366
                                                        Adj R-squared     = 0.5331
                                                        Root MSE          = 0.7214
    
                                  (Std. err. adjusted for 46 clusters in hospital)
    ------------------------------------------------------------------------------
                 |               Robust
           satis | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
       frequency |   .0537506   .0189545     2.84   0.007     .0155742     .091927
                 |
           month |
       February  |  -.0041014    .021364    -0.19   0.849    -.0471307    .0389279
          March  |   .0329811   .0257957     1.28   0.208    -.0189742    .0849363
          April  |   .0111939   .0228145     0.49   0.626    -.0347568    .0571446
            May  |  -.0018237   .0243447    -0.07   0.941    -.0508564     .047209
           June  |  -.0031572   .0188673    -0.17   0.868     -.041158    .0348435
           July  |  -.0174126   .0264676    -0.66   0.514    -.0707211     .035896
                 |
            post#|
     evertreated |
            1 1  |   .7335747   .0889709     8.25   0.000     .5543782    .9127713
                 |
            post#|
     evertreated#|
         c.month |
            0 1  |  -.0132409   .0178292    -0.74   0.462    -.0491507    .0226689
            1 1  |   .0165894   .0136877     1.21   0.232    -.0109791    .0441578
                 |
           _cons |   3.317235   .0510073    65.03   0.000     3.214501    3.419969
    ------------------------------------------------------------------------------
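    With the replicated model in memory, the parallel-trends test reported by estat ptrends can be sketched as a Wald test on the pre-intervention slope difference. This is a minimal sketch that reuses the variable names evertreated and post from the replication above. It assumes the coefficient can be referenced as 0.post#1.evertreated#c.month, and the F statistic and p-value should correspond to the F(1, 45) and Prob > F shown in the estat ptrends output:

```stata
* Sketch: Wald test of the pre-intervention difference in linear slopes
webuse hospdd, clear
bysort hospital: egen evertreated = max(procedure)
generate post = month > 3                  // intervention starts in month 4

areg satis frequency i.month 1.post#1.evertreated       ///
           i.post#1.evertreated#c.month,                ///
           absorb(hospital) vce(cluster hospital)

* H0: the linear trends are parallel before the intervention
test 0.post#1.evertreated#c.month
```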
    As we can see, we are able to replicate the results of estat ptrends. To compute averaged predicted values including confidence intervals, we could use Stata's margins command. For that, however, we would need to fit a slightly reparameterized (but equivalent) model. We also center the variable that we use for continuous time around the minimum value of time, such that the averaged predictions are constrained to be equal across treatment groups at month==1 in this case:
    Code:
    * Different parameterization and centered time:
    sum month, mean
    gen cmonth = month - r(min)
    areg satis frequency i.month i.post##ib0.evertreated    ///
               i.post#ib0.evertreated#c.cmonth,             ///
               absorb(hospital) vce(cluster hospital)
    We could now calculate the predicted values for both treatment groups at month==1:
    Code:
    * Predicted values for controls and treated at time month==1
    margins , at(post=0 evertreated=0 cmonth=0 month=1)     ///
              at(post=0 evertreated=1 cmonth=0 month=1)     ///
              atmeans noestimcheck
    This yields the following result:
    Code:
    . * Predicted values for controls and treated at time month==1
    . margins , at(post=0 evertreated=0 cmonth=0 month=1)     ///
    >           at(post=0 evertreated=1 cmonth=0 month=1)     ///
    >           atmeans noestimcheck
    
    Adjusted predictions                                     Number of obs = 7,368
    Model VCE: Robust
    
    Expression: Linear prediction, predict()
    1._at: frequency   = 2.473398 (mean)
           month       =        1
           post        =        0
           evertreated =        0
           cmonth      =        0
    2._at: frequency   = 2.473398 (mean)
           month       =        1
           post        =        0
           evertreated =        1
           cmonth      =        0
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   3.444675   .0113515   303.46   0.000     3.421812    3.467538
              2  |   3.444675   .0113515   303.46   0.000     3.421812    3.467538
    ------------------------------------------------------------------------------
    The predictions for month==4, the first post-intervention time point, would be the following:
    Code:
    . * Predicted values for controls and treated at time month==4
    . margins , at(post=1 evertreated=0 cmonth=3 month=4)     ///
    >           at(post=1 evertreated=1 cmonth=3 month=4)     ///
    >           atmeans noestimcheck
    
    Adjusted predictions                                     Number of obs = 7,368
    Model VCE: Robust
    
    Expression: Linear prediction, predict()
    1._at: frequency   = 2.473398 (mean)
           month       =        4
           post        =        1
           evertreated =        0
           cmonth      =        3
    2._at: frequency   = 2.473398 (mean)
           month       =        4
           post        =        1
           evertreated =        1
           cmonth      =        3
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   3.455869   .0177006   195.24   0.000     3.420218     3.49152
              2  |   4.269042   .0335431   127.27   0.000     4.201483    4.336601
    ------------------------------------------------------------------------------
    These are the point estimates shown in the linear trend model results of estat trendplots:
    Code:
    webuse hospdd
    didregress (satis frequency) (procedure), group(hospital) time(month)
    estat trendplots
    [Image: trendpl.png]


    As mentioned earlier, the choice of centering point for the continuous time variable is arbitrary. If we were to use month==3 as the centering point, for instance, the predicted values would be constrained to equality at that point, and the predicted values and confidence intervals for month==4, say, would be different. In short, this model and graph should really only be used for checking linear trends across groups, which we do here by looking at the difference in linear slopes between the two groups within the pre-intervention era. The model is not suited to (meaningfully) predict levels of the outcome for both groups.
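    The dependence on the centering point can be checked directly. The following is a minimal sketch, not part of the analysis above, that refits the reparameterized model with time centered at month==3. The variable name cmonth3 is introduced here only for illustration. The two groups should then share the same prediction at month==3 instead of month==1, and the month==4 predictions for the treated group should differ from those shown earlier:

```stata
* Sketch: same model, but with continuous time centered at month==3
webuse hospdd, clear
bysort hospital: egen evertreated = max(procedure)
generate post = month > 3                  // intervention starts in month 4
generate cmonth3 = month - 3               // hypothetical name; zero at month==3

areg satis frequency i.month i.post##ib0.evertreated    ///
           i.post#ib0.evertreated#c.cmonth3,            ///
           absorb(hospital) vce(cluster hospital)

* Predictions for controls and treated at month==3 and month==4
margins , at(post=0 evertreated=0 cmonth3=0 month=3)    ///
          at(post=0 evertreated=1 cmonth3=0 month=3)    ///
          at(post=1 evertreated=0 cmonth3=1 month=4)    ///
          at(post=1 evertreated=1 cmonth3=1 month=4)    ///
          atmeans noestimcheck
```

    Comparing the third and fourth rows with the month==4 results above shows how much of the level difference is an artifact of the centering choice.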

    I hope this helps,
    Joerg



    • #3
      Hello Joerg,

      Thank you for this detailed answer. It was exactly what I needed to get a better understanding of how "estat trendplots" works.

      Antoine



      • #4
        Hi Joerg,
        I know this topic was written a while ago, but I am struggling to find the right code to go from the fitted model (with areg) to the graph of the linear-trends model. Could you share how to create the graph manually?
        Thanks for your help!
        Israel
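        One possible route from the fitted areg model to a graph, sketched here under assumptions rather than taken from the thread, is to loop over months and groups, store each margins prediction with its confidence limits, and plot the stored values. The names b, ll, and ul for the stored results are hypothetical, and time is centered at month==1 as in post #2, so the caveat about the arbitrary centering point applies to these intervals as well:

```stata
* Sketch: linear-trends predictions with 95% CIs for every month (hospdd)
webuse hospdd, clear
bysort hospital: egen evertreated = max(procedure)
generate post = month > 3                  // intervention starts in month 4
summarize month, meanonly
generate cmonth = month - r(min)           // centering point: month==1

areg satis frequency i.month i.post##ib0.evertreated    ///
           i.post#ib0.evertreated#c.cmonth,             ///
           absorb(hospital) vce(cluster hospital)

* Store one prediction per month and group (b, ll, ul are hypothetical names)
tempname ph
tempfile preds
postfile `ph' month evertreated b ll ul using `preds'
forvalues m = 1/7 {
    local p = (`m' > 3)                    // post indicator implied by the month
    local c = `m' - 1                      // centered time implied by the month
    forvalues g = 0/1 {
        quietly margins , at(post=`p' evertreated=`g' cmonth=`c' month=`m') ///
                          atmeans noestimcheck
        matrix T = r(table)                // row 1: estimate, rows 5-6: CI limits
        post `ph' (`m') (`g') (T[1,1]) (T[5,1]) (T[6,1])
    }
}
postclose `ph'

* Plot the stored predictions and intervals for both groups
use `preds', clear
twoway (rcap ll ul month if evertreated == 0)           ///
       (connected b month if evertreated == 0)          ///
       (rcap ll ul month if evertreated == 1)           ///
       (connected b month if evertreated == 1),         ///
       xline(3.5) ytitle("Linear prediction of satis")  ///
       legend(order(2 "Control" 4 "Treatment"))
```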



        • #5
          Hi Joerg Luedicke (StataCorp)

          I am trying to use the didregress command. All treated groups have the same treatment period, i.e., the same number in the Minimum and Maximum rows of the Treatment column, but the Control column shows a different minimum and maximum.
          I can run estat ptrends and get a valid result, but Stata is not letting me run estat trendplots, which I would really like to do.

          The error message I get is "treatment assignment times vary; not allowed with estat trendplots".

          Has anyone else encountered this problem? I would really appreciate some help with how to fix it!

          Many thanks



          • #6
            Different min/max times in the control column should not, by themselves, cause a problem. Also, if estat ptrends doesn't complain, estat trendplots shouldn't complain either. It might be that you need to update your Stata. If you enter the following in Stata
            Code:
            update query
            the current update level should be
            Code:
            Current update level:    10 Jan 2023  (what's new)
            If you are seeing an older date, please follow the instructions on the screen to update Stata. If your Stata is up-to-date and the problem persists, please send a reproducible example to

            [email protected]

            Hope this helps,
            Joerg
