  • Clustered standard errors with group-specific slope parameters

    Hello,

    I am trying to estimate a regression with 746 city fixed effects and a distinct slope parameter for each city. I would like to cluster the standard errors by city. I am using Stata 16.

    The results seem fine when I use no option or the robust option to compute standard errors. When I try to cluster them by city, the standard errors get extremely small, to an extent that makes me suspect that something must be incorrect.

    I was able to recreate the same behavior with sample data.

    Code:
    sysuse auto, clear
    
    reg price c.weight#i.foreign i.foreign, robust
    *This looks reasonable
    
    reg price c.weight#i.foreign i.foreign, vce(cluster foreign)
    *This produces incredibly small standard errors
    
    xtset foreign
    xtreg price c.weight#i.foreign, fe vce(robust)
    *This returns the same slope coefficients as above, but does not display standard errors at all

    While the sample data has only two groups, my actual data shows the same behavior with 746 groups. Restricting the sample to cities with at least 1000 observations does not change anything either, so it does not appear to be driven by a small number of observations.

    What am I doing wrong?

  • #2
    Bernhard:
    welcome to this forum.
    The way you coded the interaction is not correct; it should be (taking -regress- as an example):
    Code:
    sysuse auto, clear
    
    reg price c.weight c.weight#i.foreign i.foreign, robust
    or, in a more efficient way:
    Code:
    sysuse auto, clear
    
    reg price c.weight##i.foreign, robust
    Double-check whether or not this fix makes your results more reasonable (whatever that may mean).

    In addition:
    1) under -regress-, the -robust- option accounts for heteroskedasticity only. To account for serial correlation as well, use -vce(cluster clusterid)-;
    2) conversely, -robust- and -vce(cluster clusterid)- do the very same job under -xtreg-, as both invoke the cluster-robust standard error.
    Kind regards,
    Carlo
    (StataNow 18.5)

    • #3
      Hello Carlo,

      Thank you very much for the answer and the welcome!

      I am trying to understand why the explicit inclusion of the base-effect of c.weight is necessary. For the examples I tried, I get equivalent results.

      reg price c.weight#i.foreign i.foreign   reg price c.weight c.weight#i.foreign i.foreign   Comparison
      _b[0b.foreign#c.weight]                  _b[weight]                                        same coefficient, same standard error
      _b[1.foreign#c.weight]                   _b[weight] + _b[1.foreign#c.weight]               same coefficient, can compute same standard error using lincom

      In the examples I tried, this holds whether I use conventional or heteroskedasticity-robust standard errors. It also holds if I include controls and if the categorical variable has more than two categories. All coefficients and standard errors not shown in the table (constant, fixed effects) are also the same. This is something rather fundamental about regressions in Stata, so I would be very happy to understand when this equivalence breaks down and why my version is not correct.
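
      A minimal sketch of the comparison in the table, using the auto data from post #1. The scalar names (bA, seA, bB, seB) are introduced here for illustration only. It stores the foreign-car slope and its robust standard error from the first parametrization, then recovers the same quantity from the second with lincom:

      ```stata
      sysuse auto, clear

      * Parametrization without the base effect: one slope per group
      regress price c.weight#i.foreign i.foreign, vce(robust)
      scalar bA  = _b[1.foreign#c.weight]
      scalar seA = _se[1.foreign#c.weight]

      * Parametrization with the base effect: base slope plus a deviation
      regress price c.weight c.weight#i.foreign i.foreign, vce(robust)
      lincom _b[weight] + _b[1.foreign#c.weight]
      scalar bB  = r(estimate)
      scalar seB = r(se)

      * Relative differences between the two versions of slope and standard error
      display reldif(bA, bB)
      display reldif(seA, seB)
      ```

      Both relative differences should be zero up to rounding error, since the second model is only a linear reparametrization of the first.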

      Interestingly, the standard errors do indeed somewhat change with the explicit inclusion of the base effect when I cluster them using vce(cluster foreign). (They stay the same if I include an additional control). However, my issue stays qualitatively the same. To give one example, when I run
      Code:
      sysuse auto, clear
      
      reg price c.weight c.weight#i.foreign i.foreign
      with different options for standard errors and look at the t-value of weight, it is 7.19 with no option, 5.52 with the robust option and 1.5e+15 with the vce(cluster foreign) option. This makes me suspect that there must be something wrong with my clustering.

      • #4
        You can’t cluster to obtain standard errors for the fixed effects, whether they’re intercepts or slopes. Clustering only works for parameters assumed constant across city.
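
        One way to see this on the auto data from post #1 is sketched below. The cluster-robust variance estimator sums the scores (residual times regressor) within each cluster before forming its middle matrix. When every cluster has its own intercept and slope, the OLS first-order conditions force those within-cluster sums to zero, so the clustered variance collapses. The variable names ehat, score_w, sum_e and sum_ew are made up for this illustration:

        ```stata
        sysuse auto, clear

        * Group-specific intercepts and slopes, as in post #1
        regress price c.weight#i.foreign i.foreign
        predict double ehat, residuals

        * Score contribution for the slope: residual times regressor
        generate double score_w = ehat * weight

        * Within-cluster sums of the intercept and slope scores
        bysort foreign: egen double sum_e  = total(ehat)
        bysort foreign: egen double sum_ew = total(score_w)

        * Both sums should be zero in each group, up to rounding error
        tabstat sum_e sum_ew, by(foreign) statistics(mean)
        ```

        With the sums at zero in every cluster, vce(cluster foreign) has nothing left to estimate the variance from, which is consistent with the implausibly large t-values reported in #3.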

        • #5
          Thank you very much! That is very good to know and explains my confusion.

          Can I bootstrap the standard errors using vce(bootstrap)? I tried, and the average bootstrapped standard error is slightly smaller than the average robust standard error in my example, while being of the same magnitude. I am unsure whether this is due to my particular sample, or whether standard bootstrapping is also not a good (conservative) choice in a case with city-specific slopes and intercepts.

          • #6
            Unless you are specifying a cluster structure when you bootstrap, you are just obtaining heteroskedasticity-robust standard errors -- which is why they produce something close to vce(robust). You're basically estimating a different equation for each city, right? Then all you can do is use heteroskedasticity-robust standard errors for each city. From your description, it seems you don't have panel data but, perhaps, people living within cities? Clustering by city would then be like clustering with a single cluster in each city's equation. I think reg lets you do it because it doesn't recognize the degeneracy. xtreg does, and that's why your standard errors are missing.
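
            The "different equation for each city" point can be checked with a short sketch on the auto data from post #1, treating foreign as the group. The scalar names are hypothetical. It compares the group slopes from the single interacted regression with those from separate regressions on each group:

            ```stata
            sysuse auto, clear

            * One regression with group-specific intercepts and slopes
            regress price c.weight#i.foreign i.foreign, vce(robust)
            scalar b_pooled0 = _b[0.foreign#c.weight]
            scalar b_pooled1 = _b[1.foreign#c.weight]

            * Separate regressions, one per group
            regress price weight if foreign == 0, vce(robust)
            scalar b_sep0 = _b[weight]
            regress price weight if foreign == 1, vce(robust)
            scalar b_sep1 = _b[weight]

            * Slopes from the two approaches, side by side
            display b_pooled0, b_sep0
            display b_pooled1, b_sep1
            ```

            The slopes should agree up to rounding. The robust standard errors are expected to differ only through the small-sample degrees-of-freedom scaling, because the pooled and the separate regressions use different N and k.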

            • #7
              Thanks again, this is extremely helpful! Exactly. I have rental objects in different cities and I am interested in estimating a slope parameter (distance to the city center) and an intercept for each individual city. No time variation. Right now I am estimating them all with one regression. I thought of one regression vs. several regressions as a question of whether I expect the control variables to have a common effect across cities, or quite different effects in different cities. Moreover, it allows me to easily compare the intercepts. But you are right, it is very close to estimating one equation per city.
