  • Hi Annie,
    1) I would suggest not using the agg(simple) option; instead, just type estat simple after running csdid.
    2) Technically, all of these variables (audit_treat estab_size firm_size pcths i.year i.naics i.district_code_num) are being interacted with year and the cohort variable (audit_fyear_csdid).
    3) If the problem occurs when you add NAICS, it may be that some NAICS codes fully explain treatment (a violation of the overlap assumption). The drimp estimator (the default) will fail to converge in this case.
    Remember, the effective sample size with csdid is much smaller than for the standard model.
    If you type
    tab year gvar
    and choose any 4 numbers (the corners of a rectangle in that tabulation), that is the effective sample for one of the 2x2 comparisons.

    4) No, it isn't possible to use some variables for the regression and others for the IPW, at least not with csdid. You could potentially do that if you reimplemented the estimator.
    On the other hand, I'm not sure whether the R version has that option.

    5) There is an omitted category in the event study, but it does not even appear in the output!
    So, if you want pretreatment effects that are similar to the standard TWFE ones, you need to use the option long2.
    There won't be an omitted category in the output, but you can think of it as the one that falls between Tm1 and Tp0.
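On point 3, the "four corners" idea can be made concrete with a minimal sketch on simulated data. The cohorts, years, and unit counts below are made up for illustration; only the tab year gvar step comes from the post itself, and the example assumes never-treated units (gvar = 0) serve as controls.

```stata
* Simulated panel (hypothetical): 300 units observed in 2012-2017.
* gvar = 0 for never treated, otherwise the first treatment year.
clear
set obs 300
gen id = _n
gen gvar = cond(id <= 100, 0, cond(id <= 200, 2014, 2016))
expand 6
bysort id: gen year = 2011 + _n

* Rows are years, columns are cohorts
tab year gvar

* ATT(g=2016, t=2017) with never-treated controls uses four cells only:
* years 2015 (= g-1) and 2017, cohorts 0 and 2016
count if inlist(year, 2015, 2017) & inlist(gvar, 0, 2016)
display "Effective sample: " r(N) " of " _N " observations"
```

Here the single 2x2 comparison uses 400 of the 1,800 observations, and any covariates have to be identified within that subsample.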

    HTH
    Fernando

    • Hi Samuel
      1) No, there isn't a way to get those coefficients. The reason is that when you run CSDID you are estimating approximately (years x cohorts) models (times two if you use a doubly robust method, because of the logits and the linear regressions).
      The only way to get some of those coefficients is to run drdid for each particular 2x2 DID and save the results from that command.

      2) Yes, it's possible, but with a slightly different procedure (since we started to make small differences in the programming).
      The default option is to produce varying-base estimates.
      For the universal base, I added the options long and long2.
      long is my original interpretation of how the pretrend should be obtained (namely, estimating a treatment effect compared to the earlier periods).
      For example ATTGT(G=20, T=10) = E(Y_19-Y_10|D==1) - E(Y_19-Y_10|D==0)
      because you are looking at how much the outcome changes from T=10 to the period before treatment took place (usually this is the G-1 period).

      Now, base universal (as defined in R's did package) is more similar to what you get with other DID approaches. I do this using the option long2.
      Here the ATTGT(G=20, T=10) = E(Y_10-Y_19|D==1) - E(Y_10-Y_19|D==0)
      ATTGT(G=20, T=30) = E(Y_30-Y_19|D==1) - E(Y_30-Y_19|D==0)

      So ALL ATTGT's are produced using the same outcome as "base" (universal base).

      estat event will produce the appropriate graph depending on whether you use the default, long, or long2.

      Hope this helps
      F

      Dear FernandoRios, I need some clarification regarding the quoted text. I did not find this in the original paper of Callaway and Sant'Anna, so I am a little confused.
      When we estimate ATT(g,t) in a post-treatment period t>=g, assuming no anticipation, say ATT(G=20, T=30), we write it in potential outcome notation as:
      Code:
        ATT (G=20,T=30) = E[Y_30 - Y_19 | D=1] - E[Y_30 - Y_19 | D=0]
      This is something like
      Code:
           E[Y_t - Y_(g-1) | D=1] - E[Y_t - Y_(g-1) | D=0]
      which is the evolution of the outcome at some future date relative to the reference period just before treatment.

      To my understanding, if one wants to write the ATT for the same group in a pre-treatment period (t<g), ATT(G=20,T=10), we write:
      Code:
      ATTGT(G=20, T=10) = E(Y_19-Y_10|D==1) - E(Y_19-Y_10|D==0)
      which is the evolution of the outcome up to the reference period from some date in the past or, as you rightly mentioned,
      "how much the outcome changes from T=10 to the period before treatment took place", which helps us identify any pre-trends.
      I hope I am understanding it correctly.

      However, under the long2 option you write the difference in the reverse order for the same ATT(G=20,T=10); it is now written as:
      Code:
      ATT(G=20,T=10) = E(Y_10-Y_19|D==1) - E(Y_10-Y_19|D==0)
      I have two requests for clarification:

      1) I have difficulty reconciling this with the definition of pre-trends. Could you clarify why the order of the outcomes is reversed between long and long2?

      2) Does universal base in the quoted text mean that, for G=20, the ATTGTs at each T (before treatment) are obtained using the same G-1 period outcome as the base? If so, why is the definition in the first case (long) referred to as a time-varying base?

      • Hi Ridwan
        You are correct. The definitions I provided were not in the original paper; they are extrapolations I made as I moved forward with the Stata version of the method.
        CS2021 actually suggests that when T<G, the ATT is defined as:

        Code:
        ATT(G,T) = E(Y_T-Y_(T-1)|g=G) - E(Y_T-Y_(T-1)|g=0)
        or
        ATT(G=20,T=10) = E(Y_10-Y_9|g=20) - E(Y_10-Y_9|g=0)
        In other words, what they do is check the PTA (parallel trends assumption) using a one-period-ahead DID.

        Now, my long option is just a natural extrapolation where we look not one period ahead but N periods ahead, up to the last period before treatment happens.
        Code:
        ATT(G,T) = E(Y_(G-1)-Y_T|g=G) - E(Y_(G-1)-Y_T|g=0)
        or
        ATT(G=20,T=10) = E(Y_19-Y_10|g=20) - E(Y_19-Y_10|g=0)
        These are pre-trends looking forward, based on any period of interest, up to the period before treatment.

        Now, long2 is what R's did package defines as base universal, because the "base" period (the one you subtract) is always the same, regardless of whether T<G or T>=G.

        Code:
        ATT(G,T) = E(Y_T-Y_(G-1)|g=G) - E(Y_T-Y_(G-1)|g=0)
        long2 was really created to get a definition of the ATTGTs that is consistent with how other commands identify them.
        For example, when one does event studies via OLS, you first create the dummies for the pre and post periods, but choose one to be left out, typically G-1 (the base). Thus all other coefficients are estimated as the difference between that period's outcome and the base G-1, which is the same as long2. The other difference here is that long2 doesn't produce an omitted coefficient, because it doesn't even estimate it.

        2) I do not have a name for the definition used by long. For me, it was just a way of looking at pretrends using "long" gaps rather than one-period gaps.
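The three pretreatment definitions can be compared by hand from cell means. The following is a minimal sketch on simulated data, not csdid's implementation: it uses no covariates, a single treated cohort (g = 20) against never-treated units (g = 0), and a hypothetical helper program did2x2 written only for this illustration.

```stata
* Toy panel (hypothetical): 100 units first treated at t=20, 100 never treated.
clear
set seed 2021
set obs 200
gen id = _n
gen g  = cond(id <= 100, 20, 0)
expand 30
bysort id: gen t = _n
* Assumed DGP: common linear trend plus an effect of 2 from t=20 onward
gen y = 0.1*t + rnormal() + 2*(g == 20)*(t >= 20)

* did2x2 a b returns E(Y_a - Y_b | g=20) - E(Y_a - Y_b | g=0) from cell means
capture program drop did2x2
program define did2x2, rclass
    args a b
    tempname m1a m1b m0a m0b
    quietly summarize y if g == 20 & t == `a', meanonly
    scalar `m1a' = r(mean)
    quietly summarize y if g == 20 & t == `b', meanonly
    scalar `m1b' = r(mean)
    quietly summarize y if g == 0 & t == `a', meanonly
    scalar `m0a' = r(mean)
    quietly summarize y if g == 0 & t == `b', meanonly
    scalar `m0b' = r(mean)
    return scalar att = (`m1a' - `m1b') - (`m0a' - `m0b')
end

did2x2 10 9                  // default: Y_10 - Y_9 (short gap, varying base)
scalar att_short = r(att)
did2x2 19 10                 // long:  Y_19 - Y_10
scalar att_long = r(att)
did2x2 10 19                 // long2: Y_10 - Y_19 (universal base G-1)
scalar att_long2 = r(att)
did2x2 30 19                 // post-treatment ATT(20,30): base is G-1
scalar att_post = r(att)

display "short = " att_short "  long = " att_long "  long2 = " att_long2 "  post = " att_post
* long and long2 differ only by a sign flip
assert abs(att_long + att_long2) < 1e-12
```

Under parallel trends all three pretreatment numbers should be close to zero, while the post-treatment one should be close to the assumed effect of 2.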

        HTH
        Fernando

        • Thank you very much FernandoRios, you have explained it very clearly.

          Now, if I write the following code:
          Code:
           
          gen ln_y = ln(y)    // csdid needs a variable name, not an expression
          csdid ln_y if first_treat !=0 , ivar(id) time(year) gvar(first_treat) notyet method(dripw) saverif(A1)
          
          use A1, clear
          csdid_stats event, estore(event)
          esttab event, se
          event_plot event, default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-12(1)10) ///
              title("Callaway and Sant'Anna (2020)")) stub_lag(Tp#) stub_lead(Tm#) together
          1) What is the Stata default in this case: varying base or universal base?

          2) What is the proper way to tweak the code to produce the pre-treatment estimates using either base? Suppose we want pre-treatment estimates that use the universal base; do we need to write something like the code below?

          Code:
          use A1, clear
          csdid_stats event, estore(event) long2
          esttab event, se
          event_plot event, default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-12(1)10) ///
              title("Callaway and Sant'Anna (2020)")) stub_lag(Tp#) stub_lead(Tm#) together

          • 1) The default in Stata is the "short-gap, varying base".
            Thus
            Code:
            ATT(G,T) = E(Y_T-Y_(T-1)|g=G) - E(Y_T-Y_(T-1)|g=0) if T<G

            2) If you use the default CSDID options, estat event (or csdid_stats) will also use the default options.
            You cannot change the pretreatment estimation afterwards (except perhaps between long and long2, since they differ only by a sign flip).
            In other words, you need to re-estimate the model to get different "pretrend" estimates.
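A minimal sketch of that workflow, under stated assumptions: the data are simulated, ln_y is a hypothetical log-outcome variable, and the user-written drdid and csdid packages are installed. The point is that long2 is an option of csdid itself, so the model is estimated again rather than post-processed.

```stata
* Requires the user-written packages:  ssc install drdid   and   ssc install csdid
* Toy panel (hypothetical): cohorts first treated in 2014 or 2016, plus never treated (0)
clear
set seed 42
set obs 300
gen id = _n
gen first_treat = cond(id <= 100, 0, cond(id <= 200, 2014, 2016))
expand 8
bysort id: gen year = 2009 + _n                      // 2010-2017
gen ln_y = 0.05*(year - 2010) + rnormal() + (first_treat > 0)*(year >= first_treat)

* Default: short-gap, varying-base pretreatment estimates
csdid ln_y, ivar(id) time(year) gvar(first_treat) notyet method(dripw)
estat event

* Universal base: re-estimate with long2, then aggregate again
csdid ln_y, ivar(id) time(year) gvar(first_treat) notyet method(dripw) long2
estat event
```

The post-treatment estimates should be the same in both runs; only the pretreatment (Tm) coefficients are defined differently.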

            HTH

            • Thank you FernandoRios for taking the time to answer our queries; you have been very helpful.

              (Ridwan)

              • Dear Fernando,

                Is the Post_avg coefficient (from estat event) equivalent to the ATT (from estat simple)? If not, what are the differences?
                If I want to estimate the ATT within the window (-20, 20), should I use estat cevent, window(-20, 20)? If so, is that coefficient equivalent to the Post_avg from estat event, window(-20, 20)?

                Thank you very much!

                Alex

                • Hi Alex
                  They are different; they are not equivalent.
                  1. The simple ATT is, as the name states, a simple average of ALL ATTGTs after treatment takes place.
                  2. Post average is the average of all dynamic (event-time) effects after treatment, where each post period has the same weight.
                  3. estat cevent takes the average over all periods in the chosen window. So it isn't the same as the post average from estat event, because the post average is ONLY for POST-treatment periods.
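A small worked example shows why the first two aggregates differ. The ATT values below are made up, and the sketch assumes two equal-sized cohorts so that every ATT(g,t) cell gets the same weight in the simple average; the weights csdid actually applies may differ.

```stata
* Hypothetical post-treatment ATT(g,t) values, indexed by event time e = t - g.
* Cohort A is observed for three post periods, cohort B for only one.
scalar attA_e0 = 1
scalar attA_e1 = 2
scalar attA_e2 = 3
scalar attB_e0 = 1

* "Simple": average over all post-treatment (g,t) cells
scalar simple_att = (attA_e0 + attA_e1 + attA_e2 + attB_e0) / 4

* Event study: first average across cohorts within each event time...
scalar tp0 = (attA_e0 + attB_e0) / 2
scalar tp1 = attA_e1
scalar tp2 = attA_e2
* ...then give each post period the same weight
scalar post_avg = (tp0 + tp1 + tp2) / 3

display "simple = " simple_att "   post average = " post_avg
```

This displays simple = 1.75 and post average = 2: the event-time Tp0 counts once in the post average but contributes two of the four cells in the simple average.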

                  HTH

                  • Dear Fernando,
                    Thanks for your great contribution to the CSDID package!
                    My dataset is at the loan level, which means there can be multiple loans for one firm in one year. I focus on the effect of a piece of legislation (LEG), adopted by different states at different times (staggered), on the loan interest rate (INT). I have some questions.
                    I use the following code for a TWFE regression:
                    reghdfe INT LEG MATURITY SIZE GDP, a(i.year i.firm i.guarantee) cluster(firm)

                    where MATURITY is the maturity of the loan (loan-specific), SIZE is the asset size of firm i in year t (firm-year specific), and GDP is the economic development of the state in year t (state-year specific). In other words, all the controls are time-varying. guarantee is the type of collateral of the loan, which I include in the model as a fixed effect.
                    1. Is the above regression valid as a staggered DID, given that the sample is at the loan level rather than a firm-year panel?
                    I use CSDID to improve the model with the following code:
                    csdid INT MATURITY SIZE GDP i.guarantee, time(year) gvar(first_treat) notyet cluster(firmcd)

                    where first_treat is the first year in which a state adopts the legislation.
                    2. Is CSDID applicable to these data, given that the sample is at the loan level rather than a firm-year panel? In my understanding, CSDID is applicable because the sample is a kind of repeated cross-section.
                    3. I have read other posts in the forum and understand that fixed effects can be included in CSDID, but when I include them, all the output is blank and a "conformability error" appears (please find the attachment). Could you shed light on it?
                    4. I noticed in the help file that, for the controls, "Only base period values are used in the model estimation". Does that mean that including time-varying covariates does not help to improve the model? If so, how does CSDID deal with parallel trends conditional on observed time-varying covariates?
                    My questions may be silly. I would greatly appreciate any helpful answers from you!

                    • Hi @FernandoRios,

                      Thank you for the great package. I have a question regarding the aggregated event study coefficients. Is there a way to get event study coefficients for a subset of cohorts? I am working on a county-level panel dataset, where adoption of a policy varies from, say, 2005 to 2013. Some counties were selected as pilot counties and adopted the policy in 2005, 2006, and 2007. The remaining, non-pilot counties adopted the policy in 2010-2013. Is it possible to estimate two event studies, one for pilot counties (cohorts 2005, 2006, and 2007) and the other for non-pilot counties (cohorts 2010 to 2013)? I want the two event studies separately to see whether the effect evolves differently for pilot and non-pilot counties. Thank you very much!

                      Best,
                      Yupei

                      • Hi Chunxiao Geng

                        1) I'm not sure what guarantee is, nor what your treatment variable is.
                        In any case, if you are trying to use a TWFE approach, you may still want to have a variable for the year of first treatment, and periods, integrated into your model.
                        You do not need to control for firm fixed effects in that case, only for the cohort (or year of first treatment). Or at least that is how I understand Wooldridge's approach.

                        2) As for your use of first_treat, I'm still unsure which variable in the TWFE model relates to first_treat. I do think you can use CSDID treating the data as a kind of repeated cross-section rather than a panel, because you do not have a formal panel dataset, unless you assume that each firm/loan is a different panel unit.

                        3) You can add fixed effects (as dummies), but with caveats.
                        - You need both treated and not-yet-treated (or never-treated) observations for each subgroup. In your case, you need both treated and untreated units for each value of guarantee. Otherwise, you are violating the overlap assumption.
                        - Even if you have treated and untreated observations for each level of guarantee, you also need enough observations to identify all other variables, especially for the drimp and dripw methods. The reason is that each time CSDID runs a specific 2x2 DID, you have only a fraction of the observations in each model. Thus, you may simply not have enough data to identify coefficients for all variables in your model. This bites a lot when you use logits or inverse probability tilting specifications.
                        - Because of the above, you are getting NO results, so there is nothing to summarize. That is why, I believe, you are getting the reported error.

                        4) If you are using the panel data estimators, only base period values are used in the estimation. In fact, CS, and the example they provide, start from the assumption that all controls are time-fixed. So it doesn't matter which period's data you use; it will have the same impact.

                        Empirically, unless you transform your data, the next best thing was to use the base period values for the regressions when you look forward (estimating post-treatment ATTs). All later data are simply not used, because they may be contaminated by effects of the treatment (which you want to avoid).

                        When you have repeated cross-sections, you cannot do that, because you do not have data for periods other than the current one. Thus CS simply imposes the assumption that the data are either stationary or as good as fixed.

                        You can try using data from after the treatment was introduced; in fact, a later paper by Callaway and coauthors (it came out this year, but I do not recall the title) suggests doing this, with the caveat that it may have large consequences for the estimated ATTs.
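On point 3, a quick way to look for overlap problems before running csdid is to cross-tabulate the fixed-effect variable against the cohort variable, overall and within a single 2x2 comparison. The sketch below uses simulated loan-level data in which guarantee type 3 is deliberately constructed to appear only in one treated cohort; all values are hypothetical.

```stata
* Toy loan-level data (hypothetical): first_treat = 0 means never treated
clear
set seed 7
set obs 600
gen first_treat = cond(_n <= 200, 0, cond(_n <= 400, 2012, 2015))
gen year = 2010 + floor(8*runiform())                // 2010-2017
gen guarantee = 1 + (runiform() < 0.5)
* Type 3 occurs only in the 2015 cohort, so it has no comparison loans
replace guarantee = 3 if first_treat == 2015 & runiform() < 0.3

* Empty cells flag guarantee levels without treated or without comparison loans
tabulate guarantee first_treat

* Same check inside one 2x2: cohort 2012 vs never treated, years 2011 and 2013
tabulate guarantee first_treat if inlist(first_treat, 0, 2012) & inlist(year, 2011, 2013)
```

The second table is the relevant one for convergence: cells that look well populated in the full sample can be thin or empty once the data are restricted to a single cohort and pair of years.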

                        Hope this helps
                        F




                        • Hi Yupei Ma

                          Currently, there is no way to do that, other than explicitly selecting your sample before the estimation is done.

                          For example you could:

                          csdid y x1 x2 x3 if inlist(first_treat,0,2005,2006,2007)
                          csdid y x1 x2 x3 if inlist(first_treat,0,2010,2011,2012,2013)

                          and get the event effects separately. Unfortunately, this won't allow you to test differences across samples.

                          That being said, it may be possible to do what you want, but it requires more programming, or the use of nlcom.

                          For example, you could estimate "manually" the Tp0 effect for cohorts 2005, 2006, and 2007 only, and separately for cohorts 2010, 2011, 2012, and 2013, and then contrast the difference.
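The sample-selection approach can be sketched end to end as follows. This assumes the user-written drdid and csdid packages are installed, uses a simulated county panel with made-up values, omits covariates for brevity, and writes out the ivar(), time(), and gvar() options that the shorthand commands above leave implicit.

```stata
* Requires the user-written packages:  ssc install drdid   and   ssc install csdid
* Toy county panel (hypothetical): pilot cohorts 2005-2007, non-pilot 2010-2013,
* never treated coded as first_treat = 0
clear
set seed 2005
set obs 800
gen id = _n
gen first_treat = 0
replace first_treat = 2005 + mod(id, 3) if inrange(id, 201, 500)
replace first_treat = 2010 + mod(id, 4) if id > 500
expand 14
bysort id: gen year = 2001 + _n                      // 2002-2015
gen y = rnormal() + (first_treat > 0)*(year >= first_treat)

* Pilot cohorts against never-treated counties
csdid y if inlist(first_treat, 0, 2005, 2006, 2007), ivar(id) time(year) gvar(first_treat)
estat event, estore(ev_pilot)

* Non-pilot cohorts against never-treated counties
csdid y if inlist(first_treat, 0, 2010, 2011, 2012, 2013), ivar(id) time(year) gvar(first_treat)
estat event, estore(ev_nonpilot)

* Side-by-side event-study coefficients (no formal test across samples)
estimates table ev_pilot ev_nonpilot, se
```

Because the two event studies come from separate estimations, the table only allows an informal comparison; a formal test of the difference needs the manual aggregation described in the post.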
                          HTH
                          F

                          • Hi @FernandoRios,

                            Thank you so much for the explanation! It is very helpful. I have a follow-up question if you don't mind. When you say I can "manually" estimate the Tp0 effect for only cohorts 2005 2006 2007 and separately for 2010, 2011, 2012 2013, do you mean that I should 1) aggregate the group-time ATT at Tp0 for cohorts 2005-2007 (or 2010-2013) using the right weights (something similar to Table 1 in CS2020); and 2) use the "nlcom" command and the formula in step 1 to estimate? Thank you very much!

                            Best,
                            Yupei

                            • Exactly.
                              In fact, if you type
                              ereturn display
                              after csdid, you will see the weights I use to estimate those aggregates.
                              HTH
                              Fernando

                              • Got it. Thank you very much!

                                Yupei
