  • IV(2SLS) regression with fixed effects and two-way clustering

    Hello,

    I tried an IV (2SLS) regression in Stata, but I do not know whether what I did is right.

    Please let me know if there is a better solution.

    Fixed effects: year fixed effect and id-code fixed effect
    Clustering: standard errors clustered at the region and year levels.



    egen id1= group(id)
    egen code1 = group(code)
    egen region1 = group(region)
    egen double_fixed = group(id1 code1)
    egen double_cluster = group(region1 year)

    1)

    ivreghdfe Y X1 X2 X3 (endogenous variable=instrument variable), absorb(i.year id1 code1) cluster(region1 year)

    2)

    ivreghdfe Y X1 X2 X3 (endogenous variable=instrument variable), absorb(i.year double_fixed) cluster(double_cluster)

    3)

    xtset double_fixed
    xtreg endogenous_variable instrument_variable X1 X2 X3 i.year, fe
    predict endo_var_hat, xb
    xtreg Y endo_var_hat X1 X2 X3 i.year, fe

    4)

    xtreg endogenous_variable instrument_variable X1 X2 X3 i.year, fe vce(cluster double_cluster)
    Error: panels are not nested within clusters

    Which one is correct? Is there a better solution?
    Last edited by SeongMyeong Kang; 05 Nov 2019, 21:58.

  • #2
    typo: vreghdfe -> ivreghdfe (revised)
    Last edited by SeongMyeong Kang; 05 Nov 2019, 21:59.


    • #3
      I have the same problem and do not know which way is correct. Can somebody please explain this?


      • #4
        ivreghdfe is from https://github.com/sergiocorreia/ivreghdfe (FAQ Advice #12).

        For two-way clusters, you need two variables, so

        ivreghdfe Y X1 X2 X3 (endogenous variable=instrument variable), absorb(i.year id1 code1) cluster(region1 year)
        is correct. The other specifications either cluster one-way on a single variable (for example double_cluster, which groups region-year cells) or, as in 3), do not cluster at all. Two-way clustering appears to have been applied in an ad hoc manner ever since statistical software packages made it accessible. However, it needs a theoretical basis, so it is important not to use it simply because it is available. With panel data, clustering on the panel identifier is usually sufficient.
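
        To make the difference between these clustering choices concrete, here is a minimal sketch on a public dataset. It assumes that ivreghdfe and its dependencies (ivreg2, reghdfe, ftools) are installed. The nlswork data and its variables are stand-ins chosen for illustration, not the original poster's data, and the small number of industry and year clusters means the resulting standard errors should not be taken at face value.

```stata
* Sketch only: assumes ivreghdfe, ivreg2, reghdfe and ftools are installed.
* nlswork and its variables are illustrative stand-ins for Y, X1-X3, id1, region1.
webuse nlswork, clear

* Two-way clustering: list two variables in cluster()
ivreghdfe ln_wage age not_smsa (tenure = union south), absorb(idcode year) cluster(ind_code year)

* One-way clustering on industry-year cells: a single grouped variable
egen ind_year = group(ind_code year)
ivreghdfe ln_wage age not_smsa (tenure = union south), absorb(idcode year) cluster(ind_year)

* One-way clustering on the panel identifier
ivreghdfe ln_wage age not_smsa (tenure = union south), absorb(idcode year) cluster(idcode)
```

        The point estimates should be identical across the three calls. Only the standard errors change, because the fixed effects and instruments are the same and only the assumed correlation structure of the errors differs.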
