  • Yes, there are a couple of ways.
    Option one: the window() option (undocumented, I believe):

    estat pretrend, window(-5 -1)

    would restrict the pretrend test to periods -5 through -1.

    The other option is to do the test yourself after csdid:
    identify all the pretreatment ATTs and use "test" to test them jointly.

    HTH

    Comment


    • Originally posted by FernandoRios View Post
      Yes, there are a couple of ways.
      Option one: the window() option (undocumented, I believe):

      estat pretrend, window(-5 -1)

      would restrict the pretrend test to periods -5 through -1.

      The other option is to do the test yourself after csdid:
      identify all the pretreatment ATTs and use "test" to test them jointly.

      HTH
      Thank you so much for your reply!

      From my understanding, this means that the following two commands should essentially estimate the same thing?

      1.
      estat event, post
      test Tm5 Tm4 Tm3 Tm2 Tm1

      2.
      estat pretrend, window(-5 -1)

      However, with command #1, Prob > chi2 = 0.7771, while with command #2 the p-value is 0.0000.
      How should one interpret this difference? From my understanding, the first command supports parallel pre-trends, while #2 does not.

      All the best,
      Katarina

      Comment


      • They are similar, but not equivalent.
        The pretrend test is a joint test with the null that all pretreatment ATT(g,t)s are equal to zero.
        My other code tests whether the aggregated pretreatment effects are equal to zero.
        Consider a simple case of two groups:
        g1 has an ATT(g,t) of -10 at t-2,
        and g2 of 10 at t-2.
        Individually they are significantly different from zero,
        but the average will not be.
        HTH,
        Fernando
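        As a numeric sketch of this two-group example (illustrative numbers, not csdid output; it assumes the two ATT estimates are independent normals with made-up standard errors of 1):

```python
import math

# Two pretreatment ATT(g,t) estimates at t-2, each with SE 1,
# assumed independent for simplicity (hypothetical values).
att = [-10.0, 10.0]
se = [1.0, 1.0]

# Joint test (the logic of the pretrend test):
# H0: every pretreatment ATT(g,t) is zero.
wald = sum((a / s) ** 2 for a, s in zip(att, se))  # chi-squared, 2 df
p_joint = math.exp(-wald / 2)  # chi-squared survival function for df = 2

# Test of the aggregated (average) pretreatment effect:
# H0: the mean of the ATT(g,t)s is zero.
avg = sum(att) / len(att)
se_avg = math.sqrt(sum(s ** 2 for s in se)) / len(att)
z = avg / se_avg
p_avg = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(p_joint < 0.05, p_avg < 0.05)  # True False
```

        The joint test rejects decisively (each ATT is 10 standard errors from zero), while the average is exactly zero and its test cannot reject, which is why the two p-values in #1 and #2 above can differ so much.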

        Comment


        • Hi @FernandoRios,

          I want to know how to run the csdid command with millions of observations.

          There are 6 million observations in my dataset. Whenever I execute this command, Stata crashes.

          My machine has 32 GB of RAM.

          Sincerely,

          Chengjun Wu.
          Last edited by Chengjun Wu; 07 Mar 2023, 20:04.

          Comment


          • Is this cross-sectional or panel data?
            And how many periods and cohorts do you have?

            Comment


            • Hi FernandoRios, is it possible to bin your endpoints with csdid? I want to produce an event study but instead of having 10 leads and lags, I would like to bin together everything after 5.

              Comment


              • Yes, but not all at once.
                The option

                estat cevent, window(5 10)

                would give you the "binned" average between T+5 and T+10.
                There is currently no option to show all events plus the binned endpoints together.
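                To see what the binned endpoint represents, here is a small sketch with hypothetical event-study coefficients (not csdid output; csdid's own aggregation may weight cohorts differently, while this uses a plain unweighted mean):

```python
# Hypothetical event-time ATT estimates (illustrative numbers only).
event_att = {0: 0.02, 1: 0.05, 2: 0.06, 3: 0.07, 4: 0.08,
             5: 0.10, 6: 0.11, 7: 0.12, 8: 0.12, 9: 0.13, 10: 0.13}

# The event window targeted by: estat cevent, window(5 10)
window = range(5, 11)

# The "binned" endpoint: one average effect for T+5 through T+10.
binned = sum(event_att[t] for t in window) / len(window)
print(round(binned, 3))  # 0.118
```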

                Comment


                • Hi FernandoRios,
                  Thanks again for all the work with the csdid package and all your helpful answers in this thread!

                  I’m using the csdid command with the dripw method, i.e. the doubly robust DiD estimator based on IPW and OLS, on a panel data set.

                  However, I have two questions:
                  1) When trying to use inverse probability weighting on panel data with a regular TWFE model, you get the error message "weights must be constant within id". This is not an issue in the csdid package. Why is that?
                  2) If you want to compare the estimates between the Callaway & Sant’Anna model (using dripw) and the TWFE model, how would you do that?

                  All the best,
                  Katarina

                  Comment


                  • Mmm,
                    for Q1: csdid in this case is closer to what reghdfe does than to what xtreg does, so weighting is not a problem.
                    Also, csdid only needs the weights to be consistent within each 2x2 comparison, not across all periods.
                    For Q2: csdid with the long2 option would be the closest match to standard TWFE leads and lags.
                    However, I do not know of any way to make a formal model comparison, except by assuming independence or using the bootstrap.

                    Comment


                    • Hi FernandoRios,

                      I use csdid to estimate the effect of state laws on the outcome. I cluster standard errors at the state level, so I end up with 35 clusters. Is this number of clusters sufficient for csdid to produce unbiased estimates? Do I also need a minimum number of clusters in each group (a group refers to the year when the law was introduced; each group includes one or more states, and there are 7 groups)?

                      Here is my regression:

                      csdid outcome, cluster(STATE) time(calendar_year) gvar(first_treat) method(drimp) agg(simple) notyet


                      Thank you in advance.
                      Iryna

                      Comment


                      • Most papers would say no; that is probably not enough clusters to estimate standard errors reliably. However, that is as many as you have, so you will have to go ahead with it.

                        Comment


                      • Originally posted by FernandoRios View Post
                        Is this cross-sectional or panel data?
                        And how many periods and cohorts do you have?
                          It's an unbalanced panel with 739,046 groups and 14 years. The total number of observations is 6,811,375. Stata crashes when I use the csdid command.

                          Comment


                          • You may be able to do it using csdid2 (please search online for its GitHub page).

                            Comment


                            • Dear Fernando,
                              I use a fully balanced panel data set of municipalities in my study.

                              I read in a previous post that when you request panel estimators (using ivar), the standard errors are implicitly clustered at the panel id level. That is why I have been using ivar.
                              However, I can now see that there is also a clustering option. Should I use that instead? Should one always use it for municipal data?
                              How should I understand the difference between the two options?

                              My observations are at the municipal level, meaning characteristics may change over time.

                              All the best,
                              Katarina

                              Comment


                              • Originally posted by Katarina Sandberg View Post
                                Dear Fernando,
                                I use a fully balanced panel data set of municipalities in my study.

                                I read in a previous post that when you request panel estimators (using ivar), the standard errors are implicitly clustered at the panel id level. That is why I have been using ivar.
                                However, I can now see that there is also a clustering option. Should I use that instead? Should one always use it for municipal data?
                                How should I understand the difference between the two options?

                                My observations are at the municipal level, meaning characteristics may change over time.

                                All the best,
                                Katarina
                                The reason why I am asking is that I get extremely different estimates when using the two options, where the estimates from ivar seem much more realistic.

                                When using the code for ivar:
                                csdid schoolcostlog populationlog populationdensity shareofpop1619, ivar(municipality) time(year) gvar(first_treat) method(dripw)

                                The coefficient for the first 7 years is -0.04, with significance level 0.007 (using estat cevent, window(0 7)).

                                When using the code for cluster:
                                csdid schoolcostlog populationlog populationdensity shareofpop1619, cluster(municipality) time(year) gvar(first_treat) method(dripw)

                                The coefficient for the first 7 years is -0.32, with significance level 0.09 (using estat cevent, window(0 7)).

                                The coefficient with cluster is way too large for what I am trying to estimate.

                                Comment
