  • Differences between results from 'csdid' command and 'did' package in R

    Dear @FernandoRios,

    Sorry to bother! I'm hoping you might be able to help me with an issue I'm having with the 'csdid' command in Stata.

    I'm trying to implement the Callaway & Sant'Anna estimator for a staggered difference-in-differences design. I have used the R package 'did' with the function 'att_gt()', as well as the Stata command 'csdid'. I have about 12 different outcome variables. Essentially I can't get the two sets of results to match: for some outcome variables the results are very similar, but for others they are quite different.

    I've attached a dataset for one variable. I'm using the Stata command:

    Code:
    csdid tempO ln_GNI_pc ln_wdi_pop, ivar(ccode) time(year) gvar(firstZyear) method(dripw)
    estat group, post
    and the R function:

    Code:
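    library(did)
    # Store the group-time ATTs so they can be passed to aggte()
    attgt <- att_gt(yname = "tempO",
                    tname = "year",
                    idname = "ccode",
                    gname = "firstZyear",
                    data = raw,
                    xformla = ~ ln_GNI_pc + ln_wdi_pop + 1)
    # Aggregate by treatment cohort (compare with 'estat group' in Stata)
    aggte(attgt, type = "group", na.rm = TRUE)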
    The results are a bit different and I just can't work out why. I've also experimented with the 'notyet' and 'asinr' options, which do change Stata results a bit but still aren't the same as in R. I've also experimented with all of the different 'method()' options, but again results don't converge.

    Do you have any suggestions?

    Thanks a lot!
    Rory

  • #2
    Hard to say. Is the data balanced?
    What happens if you use reg (outcome regression) as the method?
    Do the problems appear for all pre- and post-treatment ATTs?
    Could you replicate this using the example dataset?



    • #3
      Hi Fernando,

      Thanks for replying and I'm sorry for the delay in getting back to you!

      Yes, the data is balanced.

      So when I use reg for both R and Stata, the point estimates are actually the same (although the standard errors are a little different).

      When both are set to 'ipw' or doubly robust, the point estimates differ, including the group averages and dynamic averages (I mean the post-treatment ATTs; the output from att_gt() in R doesn't seem to show pre-treatment ATTs?).

      I tried to replicate the problem with the example dataset, but in that case the R/Stata output does align.

      Maybe the screenshots below will help, though. When both are set to doubly robust ('dripw' in Stata) and I use 'estat group' (which is my aggregation of interest), the group estimates for 3 cohorts are identical between R and Stata. Only for the 2001 cohort does Stata produce an estimate, while R shows all 'NA's in the output. R then gives an overall average of 0.37 while Stata reports 'omitted'.

      Thanks again for your help!

      [Image: R output.png]

      [Image: Stata output.png]



      • #4
        OK, that gives a clue.
        You see how Stata produces a 2001 result but R does not? I think the two implementations make different in-code decisions about which data to use or drop, and that may explain the differences.
        So, unfortunately, there is nothing to be done other than an in-depth exploration of each 2x2 case to see where the differences arise.
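
        One way to start, as a minimal sketch rather than a description of what either command does internally: compute a single 2x2 cell by hand and check how many units enter it. The base R example below uses made-up toy data with the variable names from this thread. The helper cell_2x2() is hypothetical, and it assumes that never-treated units are coded as firstZyear = 0, that the comparison period is g - 1, and that there are no covariates. With covariates and 'dripw' or 'ipw', the packages' numbers will differ from this raw difference, but the unit counts per cell may help show where one implementation drops a cohort.

```r
# Toy balanced panel: names follow the thread, values are invented.
toy <- data.frame(
  ccode      = rep(1:4, each = 3),
  year       = rep(2000:2002, times = 4),
  firstZyear = rep(c(2001, 2001, 0, 0), each = 3),  # assumption: 0 = never treated
  tempO      = c(1, 3, 4,   2, 4, 5,   1, 2, 2,   2, 3, 3)
)

# Hypothetical helper: unconditional 2x2 DiD for cohort g at time t,
# using never-treated units as controls and `base` as the pre-period.
cell_2x2 <- function(d, g, t, base = g - 1) {
  # Keep only cohort g and the never-treated group
  s <- d[d$firstZyear %in% c(g, 0), ]
  pre  <- s[s$year == base, c("ccode", "firstZyear", "tempO")]
  post <- s[s$year == t,    c("ccode", "tempO")]
  # Units observed in both periods; outcome columns become tempO_pre / tempO_post
  m <- merge(pre, post, by = "ccode", suffixes = c("_pre", "_post"))
  m$dy <- m$tempO_post - m$tempO_pre
  treated <- m$firstZyear == g
  c(n_treated = sum(treated),
    n_control = sum(!treated),
    att       = mean(m$dy[treated]) - mean(m$dy[!treated]))
}

# One cell per post-treatment year for the 2001 cohort
print(cell_2x2(toy, g = 2001, t = 2001))
print(cell_2x2(toy, g = 2001, t = 2002))
```

        Running this for every cohort and year on the real data, and comparing the counts and estimates with the group-time output from both packages, should narrow down which cells disagree.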



        • #5
          Thanks for this. Could you explain how I'd do that? Would I have to go into the code for each command?
