
  • #16
    Hey George Ford, can you say a little more about what you mean regarding the inference procedure? I knew something wasn't right about the CIs or SEs, but I couldn't figure out what it was. The R2 and ATT match the results from HCW precisely, but the inference is off by just a little. Here is the relevant MATLAB part of Kathy's code
    Code:
    Omega_1_hat_FDID=(t2/t1)*mean(u1_FDID.^2);% \hat\Sigma_{1,FDID}
    
    Omega_2_hat_FDID=mean(u1_FDID.^2);        % \hat\Sigma_{2,FDID}  
    
    std_Omega_hat_FDID=sqrt(Omega_1_hat_FDID+Omega_2_hat_FDID);
    
    % square-root of \hat Sigma^2_{FDID}
     
    ATT_std_FDID=sqrt(t2)*ATT_FDID/std_Omega_hat_FDID
    
    % standardized ATT, it is N(0,1) under H0: ATT = 0
    
    p_value_forward_DID=2*(1-normcdf(abs(ATT_std_FDID))) % p-value for ATT=0
    
    p_value_f_one_sided=(1-normcdf(ATT_std_FDID)) % p-value for 1-sided test
    
    CI_95_FDID_left= ATT_FDID-1.96*std_Omega_hat_FDID/sqrt(t2);
    
    CI_95_FDID_right=ATT_FDID+1.96*std_Omega_hat_FDID/sqrt(t2);
    
    CI_95_FDID_width=[CI_95_FDID_left,CI_95_FDID_right,CI_95_FDID_right-CI_95_FDID_left]
    The MATLAB script is a little ugly, but it's what the original code was written in.
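For readers without MATLAB, the same steps can be sketched in Python using only the standard library. This is a rough sketch, not Kathy's code: the function name `fdid_inference` and its arguments are hypothetical, and it assumes `pre_residuals` plays the role of `u1_FDID` (the pre-treatment residuals) with `t1` equal to its length.

```python
from math import sqrt
from statistics import NormalDist, fmean


def fdid_inference(att, pre_residuals, t2, level=0.95):
    """Mirror the MATLAB steps: variance pieces, SE, z-statistic, p-values, CI."""
    t1 = len(pre_residuals)                       # pre-treatment periods (assumption)
    mean_sq = fmean([u * u for u in pre_residuals])
    omega1 = (t2 / t1) * mean_sq                  # Omega_1_hat_FDID
    omega2 = mean_sq                              # Omega_2_hat_FDID
    std_omega = sqrt(omega1 + omega2)             # std_Omega_hat_FDID
    se = std_omega / sqrt(t2)                     # standard error of the ATT
    z = att / se                                  # equals sqrt(t2)*ATT/std_Omega
    norm = NormalDist()
    crit = norm.inv_cdf(0.5 + level / 2)          # about 1.96 when level=0.95
    return {
        "se": se,
        "z": z,
        "p_two_sided": 2 * (1 - norm.cdf(abs(z))),
        "p_one_sided": 1 - norm.cdf(z),
        "ci": (att - crit * se, att + crit * se),
    }


# Toy residuals, invented purely for illustration (not the HCW data).
result = fdid_inference(att=0.5, pre_residuals=[0.1, -0.2, 0.15, -0.05], t2=2)
print(result)
```

The point to notice is that the standardized ATT and the CI both use the same quantity, `std_omega / sqrt(t2)`, so that is the number a results table should report as the SE.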


    I also agree on adding DID as an option just for comparison, as well as making it sample size specific. I'll do that once I'm okay with the inference as it is currently


    EDIT: for reference, the CI for HCW is (according to Kathy's replication file)

    Code:
    MATLAB Sample output: 
    Number_controls_selected_by_FDID = 9
    
    ATT_FDID =  0.025405
    ATT_FDID_per = 53.843
    R2_forward_DID = 0.84278
    
    ATT_std_FDID = 5.4941
    p_value_forward_DID =  3.9274e-08
    p_value_fDID_one_sided =  1.9637e-08
    Last edited by Jared Greathouse; 14 Jul 2024, 09:32.



    • #17
      George Ford after breaking it down step by step, I finally got the HCW results to replicate (the inference I mean, the CIs and SE)

      Code:
      clear *
      
      
      cls
      
      //net install fdid, from("https://raw.githubusercontent.com/jgreathouse9/FDIDTutorial/main") replace
      
      
      
      
      clear *
      
      u "https://github.com/jgreathouse9/jgreathouse9.github.io/raw/master/stata/fdid/hcw.dta"
      
      cls
      qui fdid gdp, tr(treat)  unitnames(state) //gr2opts(scheme(sj) name(hcwte, replace)) 
      
      cwf fdid_cfframe
      
      // squared pre-treatment residuals (eventtime < 0)
      g residsq = te^2 if eventtime <0
      su residsq, mean
      cls
      // o1hat = (t2/t1) * mean squared residual, with t2 = 17 and t1 = 44
      scalar o1hat=(17 / 44)*(r(mean))
      di scalar(o1hat)
      // 0.00010130090173830751
      
      su residsq, mean
      scalar o2hat = (r(mean))
      di scalar(o2hat)
      // 0.00026219056920503124
      
      scalar ohat = sqrt(scalar(o1hat)+scalar(o2hat))
      di scalar(ohat)
      // 0.019065452287930093
      
      // 95% CI: ATT (rounded to 0.025 here) -/+ z * ohat / sqrt(t2)
      di 0.025 - (((invnormal(0.975) * scalar(ohat)))/sqrt(17))
      
      di 0.025 + (((invnormal(0.975) * scalar(ohat)))/sqrt(17))
      The results round off obviously (for Stata), but THESE are the exact correct numbers. Now I just need to automate it
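As a cross-check outside Stata, the same arithmetic can be reproduced in Python with the unrounded ATT from the MATLAB output (0.025405) in place of 0.025. This is only a sketch: the variable names are mine, and the mean squared residual is copied from the `o2hat` value displayed above rather than recomputed from the data.

```python
from math import sqrt
from statistics import NormalDist

t1, t2 = 44, 17                       # pre- and post-treatment periods used above
mean_sq = 0.00026219056920503124      # mean of residsq (the o2hat value above)
att = 0.025405                        # ATT_FDID from the MATLAB sample output

o1hat = (t2 / t1) * mean_sq
o2hat = mean_sq
ohat = sqrt(o1hat + o2hat)            # about 0.019065
se = ohat / sqrt(t2)                  # standard error of the ATT
z975 = NormalDist().inv_cdf(0.975)    # Python counterpart of invnormal(0.975)

ci_low = att - z975 * se
ci_high = att + z975 * se
print(f"SE = {se:.5f}, CI = ({ci_low:.6f}, {ci_high:.6f})")
```

With these inputs the interval comes out at roughly (0.01634, 0.03447), so the remaining gap in the Stata lines above comes from typing the ATT as 0.025.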



      • #18
        Hey everyone. I've extended fdid to the multiple treated unit setting. For an example, you can do

        Code:
        clear *
        
        qui net inst fdid, from("https://raw.githubusercontent.com/jgreathouse9/FDIDTutorial/main") replace
        
        u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/hcw.dta", clear
        
        replace treat = 1 if state == "austria" & time >20
        
        fdid gdp, tr(treat) unitnames(state) 
        
        return list
        We have the treatment effect frame returned in the "multiframe", as well as a series matrix of the same, should you prefer. The results matrix holds the ATT and the total treatment effect (that is, the sum of all individual treatment effects after the intervention across all units). Note that this total may or may not be meaningful in every context.
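To make the difference between an ATT and the total treatment effect concrete, here is a small Python sketch with invented numbers. The unit names and effects are hypothetical, and the per-unit ATT is simply taken as the mean of that unit's post-intervention effects; how fdid itself aggregates across treated units is not shown in this thread.

```python
from statistics import fmean

# Invented post-intervention treatment effects, one list per treated unit.
# Units may have different numbers of post periods (different treatment dates).
effects = {
    "unit_a": [0.02, 0.03, 0.01],
    "unit_b": [0.05, 0.04],
}

# Per-unit ATT: average effect over that unit's post-intervention periods.
att_by_unit = {unit: fmean(te) for unit, te in effects.items()}

# Total treatment effect: sum of every post-intervention effect across all units.
total_effect = sum(sum(te) for te in effects.values())

print(att_by_unit, total_effect)
```

The total grows with the number of treated units and post periods, which is one reason it may not be meaningful in every setting.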



        • #19
          for r(series), I might limit it to the treated outcome(s), the counterfactual, the te, and eventtime. but that's just a personal preference. if you wanted to svmat it, there will be a lot of information you may not want.

          I think there's still an issue with the se and the CI.

          not using austria as a treated unit, the results are

          PHP Code:

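          Forward Difference-in-Differences          T0 R2:    0.843     T0 RMSE:    0.016
          -------------------------------------------------------------------------------------------
                   gdp |      ATT        t           SE         [95% Conf. Interval]     p
          -------------+-----------------------------------------------------------------------------
                 treat |   0.02540     2.803      0.00906      0.01634     0.03447    0.005
          -------------------------------------------------------------------------------------------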
          0.03447 - 0.02540 = 0.00907 = se.

          The CI should be, methinks,

          0.02540 +- 1.96*0.00907.

          But, that depends on whether the CI is correct and the se is incorrect, or vice versa.






          • #20
            When I look at the replication file for FDID (from the original MATLAB code), the CI that Kathy has in her replication file is 0.016342, 0.034468.

            When I run the newest version of FDID, here's what I'm returned

            Code:
            Forward Difference-in-Differences          T0 R2:    0.843     T0 RMSE:    0.016
            -------------------------------------------------------------------------------------------
                     gdp |      ATT        t           SE         [95% Conf. Interval]     p
            -------------+-----------------------------------------------------------------------------
                   treat |   0.02540     2.803      0.00906      0.01634     0.03447    0.005
            -------------------------------------------------------------------------------------------
            This is just an issue of rounding, no? or maybe one of precision?



            • #21
              for r(series), I might limit it to the treated outcome(s), the counterfactual, the te, and eventtime
              oh jesus you're right, I can't believe I missed that!

              EDIT: Now I kept only the relevant things!
              Last edited by Jared Greathouse; 15 Jul 2024, 13:11.



              • #22
                Then, I think the se is wrong.

                Should be: (0.03447 - 0.0254)/1.96 = 0.00462755

                the se = std_Omega_hat_FDID/sqrt(t2)



                • #23
                  You list:
                  Code:
                   
                   scalar ohat = sqrt(scalar(o1hat)+scalar(o2hat)) // 0.019065452287930093
                  0.019065452287930093/sqrt(17) = 0.0046



                  • #24
                    The only explanation I can find for why it's like this is in the appendix. Kathy writes that she "estimates omega by the Newey-West auto-correlation robust estimator". I kind of wish this were emphasized more in the main text, but I guess that's why it's like that.



                    • #25
                      The se and the confidence intervals need to match up, don't they?

                      I think [scalar ohat = sqrt(scalar(o1hat)+scalar(o2hat))] is the square root of Omega (a standard deviation), so to get the standard error it needs to be divided by sqrt(t2). When you do so, the CI and the se are squared up.

                      I don't see the t-stat being computed in the original code. That's your calculation. Maybe I'm missing something, as I haven't studied this. But when I look at the Li paper, the CI is computed as ATT +- 1.96*sqrt(Omega/t2), so sqrt(Omega/t2) is the effective se. In the code above, the sqrt(Omega) is already taken, so just need to divide by sqrt(t2).
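The argument can be checked numerically. Below is a short Python sketch using only figures already posted in this thread (the table values and the `ohat` scalar from the earlier Stata post); the variable names are mine.

```python
from math import sqrt

att = 0.02540
ci_low, ci_high = 0.01634, 0.03447    # interval printed in the fdid table
reported_se = 0.00906                 # SE column of the same table
ohat = 0.019065452287930093           # sqrt(o1hat + o2hat) from the earlier post
t2 = 17                               # post-treatment periods

half_width = (ci_high - ci_low) / 2   # CI half-width
se_from_ci = half_width / 1.96        # the SE this CI implies
se_from_omega = ohat / sqrt(t2)       # sqrt(Omega / t2), the effective SE

print(f"SE implied by CI: {se_from_ci:.5f}")
print(f"ohat / sqrt(t2):  {se_from_omega:.5f}")
print(f"t using reported SE: {att / reported_se:.2f}")
print(f"t using ohat/sqrt(t2): {att / se_from_omega:.2f}")
```

Both routes give an SE of about 0.0046, while the reported 0.00906 is essentially 1.96 times that value, i.e. the CI half-width. That also explains why the t-statistic in the table is roughly half of what `ATT / SE` should give.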

                      Last edited by George Ford; 15 Jul 2024, 14:37.



                      • #26
                        You see the same here:

                        di 0.025 - (((invnormal(0.975) * scalar(ohat)))/sqrt(17))
                        di 0.025 + (((invnormal(0.975) * scalar(ohat)))/sqrt(17))



                        • #27
                          Oh, now I see! I was wondering why the CIs weren't lining up: the SE was too big for the CIs to be what they were. So the CIs were correct, but the SE, which I'd calculated separately, wasn't.

                          What I'll do then is compute ATT +- 1.96*SE, where SE is a scalar, so that I don't get it mixed up. Now I do get the 0.00462 result.
                          Last edited by Jared Greathouse; 15 Jul 2024, 15:17.



                          • #28
                            I don't know if you'd know the answer, but how might this work when we have more than one treated unit? Say we have two. In the paper, Kathleen doesn't say how the SEs would work here; she discusses only the ATTs, not inference in the multiple-treated-unit setting.



                            • #29
                              nevermind.



                              • #30
                                I haven't pushed it to github yet

                                as of now, when I do

                                Code:
                                clear *
                                qui net inst fdid, from("https://raw.githubusercontent.com/jgreathouse9/FDIDTutorial/main") replace
                                u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/hcw.dta", clear
                                fdid gdp, tr(treat) unitnames(state) // gr2opts(scheme(plotplain)) gr1opts(scheme(plotplain)) // name(hcwte, replace))
                                I get

                                Code:
                                Forward Difference-in-Differences          T0 R2:    0.843     T0 RMSE:    0.016
                                -------------------------------------------------------------------------------------------
                                         gdp |      ATT        t           SE         [95% Conf. Interval]     p
                                -------------+-----------------------------------------------------------------------------
                                       treat |   0.02540     5.494      0.00462      0.01634     0.03447    0.000
                                -------------------------------------------------------------------------------------------
                                Treated Unit: hongkong
                                FDID selects philippines, singapore, thailand, norway, mexico, korea, indonesia, newzealand, malaysia, as the optimal donors.
                                Standard Errors are Newey-West Standard Errors.
                                See Li (2024) for technical details.
                                Last edited by Jared Greathouse; 15 Jul 2024, 15:38.
