Cluster-Robust Standard Errors

Maxence Morlet

Join Date: Mar 2021

Posts: 650
#1


22 Dec 2022, 08:24

Hi all,

I stumbled upon this webpage on Stata's website: https://www.stata.com/support/faqs/s...luster-option/.

Please correct me if I'm wrong, but I am under the impression that when Stata users type

Code:

reg y x, r

they get HC1 standard errors, as in Hinkley (1977), correct?

However, if they type vce(hc3), they obtain HC3 standard errors as in MacKinnon and White (1985), which Long and Ervin (2000) showed to outperform HC0, HC1 and HC2 in terms of size properties, particularly in small samples.
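For readers who want to see the difference in practice, here is a minimal sketch using the auto dataset that ships with Stata. The model (price on mpg) and the stored-estimate names are illustrative assumptions, not part of the original question.

```stata
* Illustrative only: auto data and the price-on-mpg model are assumptions
sysuse auto, clear

regress price mpg, vce(robust)   // HC1, what -regress ..., r- reports
estimates store hc1

regress price mpg, vce(hc2)      // leverage-adjusted: e_i^2 / (1 - h_ii)
estimates store hc2

regress price mpg, vce(hc3)      // leverage-adjusted: e_i^2 / (1 - h_ii)^2
estimates store hc3

* Same coefficients in every column; only the standard errors differ
estimates table hc1 hc2 hc3, b(%9.3f) se(%9.3f)
```

The point estimates are identical across the three fits, so any difference in inference comes from the variance estimator alone.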

My question is the following: in panel data, researchers often invoke (according to guidelines set out by Abadie, Athey, Imbens and Wooldridge (2022)) cluster-robust standard errors, which according to Stata's website are "simply that of the robust (unclustered) estimator [HC1] with the individual ei*xi’s replaced by their sums over each cluster."
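Written out, the quoted description corresponds to the following sandwich formula. The notation is an assumption introduced here for illustration: N observations, k regressors, G clusters, x_i the column vector of regressors for observation i, and \hat e_i the OLS residual. The leading factor is the finite-sample correction that -regress- applies with vce(cluster).

```latex
\widehat{V}_{\text{cluster}}
  = \frac{G}{G-1}\,\frac{N-1}{N-k}\,
    (X'X)^{-1}\Bigl(\sum_{g=1}^{G} u_g u_g'\Bigr)(X'X)^{-1},
\qquad
u_g = \sum_{i \in g} \hat e_i\, x_i .
```

With one observation per cluster (G = N), the leading factor reduces to N/(N-k) and u_g = \hat e_i x_i, which is exactly the HC1 estimator.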

Is it possible to invoke the cluster option and combine it with the HC3 standard error formula? Would it make sense econometrically?

Apologies if the question is nonsensical from a statistical point of view.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#2

22 Dec 2022, 08:28

Presumably, Jeff Wooldridge would give the best comments on this.
Maxence Morlet

Join Date: Mar 2021

Posts: 650
#3

22 Dec 2022, 08:38

It would be amazing to get Prof. Wooldridge's response. I reckon it is also an important question for the many applied researchers who use Stata, as panel data are becoming more and more available and valid inference is crucial in most fields of study.
Joro Kolev

Join Date: Aug 2018

Posts: 3047
#4

22 Dec 2022, 09:01

This is a very good question, and not a very good explanation on the Stata website.

HC1, HC2 and HC3 each adjust the residuals in a way that is meant to improve small-sample properties. HC2 and HC3 rescale each residual using its leverage (the diagonal elements of the hat matrix), while HC1 applies a simple degrees-of-freedom correction.

Cluster-robust standard errors do not make any adjustment to the residuals; they use the residuals as they are.
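For reference, the adjustments can be summarized by the weight each estimator puts on the squared residual in the middle of the sandwich. The symbols are assumptions introduced here: n observations, k regressors, \hat e_i the OLS residual and h_{ii} the leverage of observation i. The HC3 line is the form commonly used in software; the jackknife estimator in MacKinnon and White (1985) is very close to it but not identical.

```latex
\widehat{V}_{\text{HC}j}
  = (X'X)^{-1}\Bigl(\sum_{i=1}^{n} \omega_i^{(j)}\, x_i x_i'\Bigr)(X'X)^{-1},
\qquad
h_{ii} = x_i'(X'X)^{-1}x_i ,

\omega_i^{(0)} = \hat e_i^{\,2}, \quad
\omega_i^{(1)} = \frac{n}{n-k}\,\hat e_i^{\,2}, \quad
\omega_i^{(2)} = \frac{\hat e_i^{\,2}}{1-h_{ii}}, \quad
\omega_i^{(3)} = \frac{\hat e_i^{\,2}}{(1-h_{ii})^{2}} .
```

Observations with high leverage have their residuals inflated the most under HC2 and HC3, which is where the small-sample gains are expected to come from.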

Using leverage-adjusted and other adjusted standard errors/variances is very easy in Stata with the programmer's command -_robust-.

Whether adjustments such as those in HC2 and HC3 would lead to substantial improvements in the cluster-robust variance is an open research question. To my knowledge, there is no paper on the topic.
Enrique Pinzon (StataCorp)

StataCorp Employee

Join Date: Jan 2015

Posts: 215
#5

22 Dec 2022, 09:53

Hi Maxence,

The short answer to your question is YES. It makes a lot of sense. Matt Webb, James MacKinnon, and their coauthors have been working on these topics and have Stata code for it. There are also some interesting results from Bruce Hansen.

For Matt's results, please see:

https://www.statalist.org/forums/for...bust-inference

For what Bruce advocates for:

https://www.ssc.wisc.edu/~bhansen/papers/tcauchy.html

Also, note that to get what you want you can use

Code:

vce(jackknife,mse)
Maxence Morlet

Join Date: Mar 2021

Posts: 650
#6

22 Dec 2022, 09:56

Thank you very much for your responses. I'll read up on the suggested papers!
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#7

22 Dec 2022, 11:33

Enrique Pinzon (StataCorp) should also have cited his recent blog post: https://blog.stata.com/2022/10/06/he...considerations (although he does not mention panels in the blog, there is still a lot to think about)

Joro Kolev

Join Date: Aug 2018
Posts: 3047
#8

23 Dec 2022, 01:53

Originally posted by Enrique Pinzon (StataCorp)

Hi Maxence,

The short answer to your question is YES. It makes a lot of sense. Matt Webb, James MacKinnon, and their coauthors have been working on these topics and have Stata code for it. There is also some interesting results from Bruce Hansen.

For Matt's results, please see:

https://www.statalist.org/forums/for...bust-inference

For what Bruce advocates for:

https://www.ssc.wisc.edu/~bhansen/papers/tcauchy.html

Also, note that to get what you want you can use

Code:

vce(jackknife,mse)

Hi Enrique, the blog post that Rich referred to in #7 is awesome! Also, thank you for the awesome summary of the recent literature.

Can you please elaborate on how we can do the cluster jackknife recommended by Bruce Hansen in Stata?

It seems to me that Stata allows either clustering, or jackknifing at the level of individual observations.

E.g.,

Code:

. sysuse auto
(1978 automobile data)

. reg price mpg, vce(cluster rep)

Linear regression                               Number of obs     =         69
                                                F(1, 4)           =       7.50
                                                Prob > F          =     0.0519
                                                R-squared         =     0.2079
                                                Root MSE          =     2611.4

                                  (Std. err. adjusted for 5 clusters in rep78)
------------------------------------------------------------------------------
             |               Robust
       price | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -226.3607   82.63217    -2.74   0.052    -455.7843    3.063024
       _cons |   10965.23   1591.972     6.89   0.002     6545.205    15385.25
------------------------------------------------------------------------------

. reg price mpg, vce(jackknife, mse)
(running regress on estimation sample)

Jackknife replications (74)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
........................

Linear regression                                    Number of obs =        74
                                                     Replications  =        74
                                                     F(1, 73)      =     14.96
                                                     Prob > F      =    0.0002
                                                     R-squared     =    0.2196
                                                     Adj R-squared =    0.2087
                                                     Root MSE      = 2623.6529

------------------------------------------------------------------------------
             |              Jknife *
       price | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -238.8943   61.75999    -3.87   0.000    -361.9818   -115.8069
       _cons |   11253.06   1451.724     7.75   0.000      8359.78    14146.34
------------------------------------------------------------------------------

But how can we do the jackknifing at the cluster level? Stata does not seem to allow combining the two, that is, jackknifing at the cluster level.

Last edited by Joro Kolev; 23 Dec 2022, 01:57.


Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#9

23 Dec 2022, 03:35

Originally posted by Joro Kolev

how can we do the jackknifing at the cluster level?

.
. version 17.0

.
. clear *

.
. quietly sysuse auto

.
. // seedem
. set seed 1814593309

. summarize rep78, meanonly

. quietly replace rep78 = runiformint(r(min), r(max)) if missing(rep78)

.
. *
. * Begin here
. *
. jackknife _b[mpg] _b[_cons], eclass cluster(rep78) mse: regress price c.mpg
(running regress on estimation sample)

Jackknife replications (5)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.....

Linear regression                                           Number of obs = 74
                                                            Replications  =  5

      Command: regress price c.mpg
        _jk_1: _b[mpg]
        _jk_2: _b[_cons]
          n(): e(N)

                                   (Replications based on 5 clusters in rep78)
------------------------------------------------------------------------------
             |              Jknife *
             | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _jk_1 |  -238.8943   92.69114    -2.58   0.062    -496.2462    18.45752
       _jk_2 |   11253.06   1726.235     6.52   0.003     6460.265    16045.86
------------------------------------------------------------------------------

.
. exit

end of do-file

.
Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#10

23 Dec 2022, 03:49

You can also look into the following alternatives.

Code:

xtreg price c.mpg, i(rep78) vce(jackknife, mse) fe
xtgee price c.mpg, i(rep78) family(gaussian) link(identity) corr(independent) vce(jackknife, mse)

Here, the syntax that Enrique showed works directly, as-is.
Federico Tedeschi

Join Date: Mar 2015

Posts: 137
#11

07 Feb 2024, 02:39

Originally posted by Enrique Pinzon (StataCorp)

Also, note that to get what you want you can use

Code:

vce(jackknife,mse)

Very useful, thank you.
I have one doubt, however. Why should one use the "mse" option? If I understand correctly, it centers the replicates at the full-sample estimate rather than at the mean of the delete-one-cluster estimates. Isn't the HC3 standard error in MacKinnon and White (1985) instead calculated by centering at the mean of the delete-one estimates?
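One way to see how much the centering matters is to run the cluster jackknife both ways. This is a minimal sketch on the auto data. The model, the cluster variable and the decision to drop cars with missing rep78 are assumptions made for illustration, so the numbers will not match those posted in #9. In both cases the variance is (G-1)/G times a sum of squared deviations of the G delete-one-cluster estimates; the two runs differ only in the point those deviations are measured from.

```stata
* Illustrative only: data, model and cluster variable are assumptions
sysuse auto, clear
drop if missing(rep78)      // keep only cars with an observed cluster id

* Default: deviations measured from the mean of the delete-one-cluster estimates
jackknife _b, cluster(rep78): regress price mpg

* mse: deviations measured from the full-sample estimate
jackknife _b, cluster(rep78) mse: regress price mpg
```

Because the sum of squared deviations is smallest around the mean, the mse standard errors can never be smaller than the default ones, so the mse version is the more conservative of the two.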
