Cluster standard errors by firm and year

Patrick Johannes

Join Date: Dec 2020

Posts: 40
#1

10 Dec 2020, 00:23

Hello all,

I am running a fixed-effects regression using the LSDV approach (industry and year fixed effects), and I want to cluster the standard errors by firm and year.
Can I do this by grouping the two variables into a single cluster identifier? My code is below.

Is this the correct way to do it?
I appreciate your help.

Code:

xtset id year
egen double_cluster = group(company_key year)
regress firm_beta_w esg_single_lag1 i.industry_key i.year, vce(cluster double_cluster)
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17352
#2

10 Dec 2020, 01:22

Patrick:
welcome to this forum.
Some comments about your post:
- there's no need to -xtset- your data if you decide to go (pooled) OLS;
- the community-contributed command -reghdfe- allows multi-way clustering of the standard errors.
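
Since -reghdfe- is community-contributed, it does not ship with Stata and has to be installed first. A minimal sketch, assuming you have internet access and want the versions hosted on SSC (-reghdfe- depends on -ftools-):

```stata
* Install the community-contributed packages from SSC (one-time step)
ssc install ftools, replace
ssc install reghdfe, replace

* Confirm that Stata can find the command and show its version
which reghdfe
```

After that, -help reghdfe- documents the -absorb()- and -vce(cluster ...)- options.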

Kind regards,
Carlo
(StataNow 18.5)
Patrick Johannes

Join Date: Dec 2020

Posts: 40
#3

10 Dec 2020, 01:34

Hi Carlo,

Thanks for your fast answer. Is this not the right way to implement industry and year fixed effects (i.industry_key and i.year)? I cannot -xtset industry year-, because an industry does not appear only once per year: there are multiple observations for each industry-year combination.
I wanted to implement the fixed effects by estimating industry and year dummies.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17352
#4

10 Dec 2020, 03:03

Patrick:
if you have multiple observations for the same panel in each year (and Stata complains about repeated time values within panel), you can simply -xtset- your dataset with -panelid- only.
However, this fix comes at the cost of ruling out time-series operators, such as lags and leads.
I do hope that the following example can be helpful:

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtset idcode
       panel variable:  idcode (unbalanced)

. xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1162                                         min =          1
     between = 0.1078                                         avg =        6.1
     overall = 0.0932                                         max =         15

                                                F(16,4709)        =      79.11
corr(u_i, Xb)  = 0.0613                         Prob > F          =     0.0000

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0728746    .013687     5.32   0.000     .0460416    .0997075
             |
 c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
             |
        year |
         69  |   .0647054   .0155249     4.17   0.000     .0342693    .0951415
         70  |   .0284423   .0264639     1.07   0.283    -.0234395     .080324
         71  |   .0579959   .0384111     1.51   0.131    -.0173078    .1332996
         72  |   .0510671   .0502675     1.02   0.310    -.0474808     .149615
         73  |   .0424104   .0624924     0.68   0.497    -.0801038    .1649247
         75  |   .0151376    .086228     0.18   0.861    -.1539096    .1841848
         77  |   .0340933   .1106841     0.31   0.758    -.1828994     .251086
         78  |   .0537334   .1232232     0.44   0.663    -.1878417    .2953084
         80  |   .0369475   .1473725     0.25   0.802    -.2519716    .3258667
         82  |   .0391687   .1715621     0.23   0.819    -.2971733    .3755108
         83  |    .058766   .1836086     0.32   0.749    -.3011928    .4187249
         85  |   .1042758   .2080199     0.50   0.616    -.3035406    .5120922
         87  |   .1242272   .2327328     0.53   0.594    -.3320379    .5804922
         88  |   .1904977   .2486083     0.77   0.444    -.2968909    .6778863
             |
       _cons |   .3937532   .2469015     1.59   0.111    -.0902893    .8777957
-------------+----------------------------------------------------------------
     sigma_u |  .40275174
     sigma_e |  .30127563
         rho |  .64120306   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

Kind regards,
Carlo
(StataNow 18.5)


Patrick Johannes

Join Date: Dec 2020

Posts: 40
#5

10 Dec 2020, 03:10

Hello Carlo,

I tried to implement:

Code:

reghdfe firm_beta_w esg_single_lag1, absorb(industry_key year) vce(cluster company_key year)

It gives me the same coefficients as my first approach.
However, the standard errors are different, and I assume the second set must be the right one.
Could you explain the difference between my first approach and the second one?
I would appreciate it.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17352
#6

10 Dec 2020, 04:42

Otherwise, you can go OLS without -xtset-ting your data:

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtreg ln_wage c.age##c.age i.year if idcode<=3, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =         39
Group variable: idcode                          Number of groups  =          3

R-sq:                                           Obs per group:
     within  = 0.7404                                         min =         12
     between = 0.4068                                         avg =       13.0
     overall = 0.4014                                         max =         15

                                                F(4,2)            =          .
corr(u_i, Xb)  = -0.8560                        Prob > F          =          .

                                 (Std. Err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0773019   .0101936     7.58   0.017     .0334424    .1211613
             |
 c.age#c.age |  -.0045583   .0021586    -2.11   0.169    -.0138461    .0047294
             |
        year |
         69  |   .3367906   .0871839     3.86   0.061    -.0383313    .7119126
         70  |   .2089384   .2733588     0.76   0.525    -.9672295    1.385106
         71  |   .3144116   .1543689     2.04   0.179    -.3497843    .9786076
         72  |   .5888124   .4728115     1.25   0.339    -1.445531    2.623156
         73  |   .8912873   .4976548     1.79   0.215    -1.249948    3.032523
         75  |   1.246958   .5791178     2.15   0.164    -1.244785    3.738701
         77  |   1.560689   .8225333     1.90   0.198    -1.978387    5.099764
         78  |   1.941522   1.218922     1.59   0.252    -3.303077    7.186121
         80  |    2.34498   1.454951     1.61   0.248    -3.915167    8.605128
         82  |   2.698954   1.585626     1.70   0.231    -4.123442     9.52135
         83  |   2.994437   1.730077     1.73   0.226    -4.449484    10.43836
         85  |   3.538578   2.107946     1.68   0.235    -5.531183    12.60834
         87  |   3.965153      2.346     1.69   0.233     -6.12887    14.05918
         88  |    4.40786   2.563793     1.72   0.228    -6.623251    15.43897
             |
       _cons |   1.465543   .3990418     3.67   0.067    -.2513952    3.182481
-------------+----------------------------------------------------------------
     sigma_u |  .54258328
     sigma_e |  .21942548
         rho |  .85944136   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. reg ln_wage c.age##c.age i.year i.idcode if idcode<=3,  vce(cluster idcode)

Linear regression                               Number of obs     =         39
                                                F(2, 2)           =          .
                                                Prob > F          =          .
                                                R-squared         =     0.8139
                                                Root MSE          =     .21943

                                 (Std. Err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0773019   .0106911     7.23   0.019     .0313017    .1233021
             |
 c.age#c.age |  -.0045583    .002264    -2.01   0.182    -.0142995    .0051828
             |
        year |
         69  |   .3367906   .0914392     3.68   0.066    -.0566405    .7302218
         70  |   .2089384   .2867011     0.73   0.542    -1.024637    1.442514
         71  |   .3144116   .1619035     1.94   0.192     -.382203    1.011026
         72  |   .5888124   .4958888     1.19   0.357    -1.544825     2.72245
         73  |   .8912873   .5219448     1.71   0.230     -1.35446    3.137034
         75  |   1.246958   .6073839     2.05   0.176    -1.366404     3.86032
         77  |   1.560689   .8626802     1.81   0.212    -2.151125    5.272502
         78  |   1.941522   1.278416     1.52   0.268    -3.559059    7.442103
         80  |    2.34498   1.525965     1.54   0.264    -4.220718    8.910678
         82  |   2.698954   1.663018     1.62   0.246    -4.456435    9.854344
         83  |   2.994437    1.81452     1.65   0.241    -4.812813    10.80169
         85  |   3.538578   2.210833     1.60   0.251    -5.973868    13.05102
         87  |   3.965153   2.460506     1.61   0.248    -6.621548    14.55185
         88  |    4.40786   2.688929     1.64   0.243    -7.161667    15.97739
             |
      idcode |
          2  |  -.4183815   .0165036   -25.35   0.002    -.4893909   -.3473721
          3  |   .6579353   .7215294     0.91   0.458    -2.446555    3.762426
             |
       _cons |   1.341224   .1489003     9.01   0.012     .7005575     1.98189
------------------------------------------------------------------------------

.

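As you can see, the point estimates for the shared regressors are the same under -xtreg, fe- and LSDV. The cluster-robust standard errors differ slightly because the two commands apply different degrees-of-freedom adjustments: -regress- counts the panel dummies as estimated parameters, whereas -xtreg, fe- does not.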
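
If it helps, the comparison can be put side by side in one table. The following is only a sketch on the same three panels; it adds -areg- as a third way of absorbing the panel effects, and the names within, lsdv and absorbed are arbitrary labels chosen for this illustration:

```stata
* Sketch: same fixed-effects model fitted three ways on idcode<=3
use "https://www.stata-press.com/data/r16/nlswork.dta", clear
xtset idcode

* Within estimator
xtreg ln_wage c.age##c.age i.year if idcode<=3, fe vce(cluster idcode)
estimates store within

* LSDV: explicit panel dummies
regress ln_wage c.age##c.age i.year i.idcode if idcode<=3, vce(cluster idcode)
estimates store lsdv

* Panel effects absorbed rather than reported
areg ln_wage c.age##c.age i.year if idcode<=3, absorb(idcode) vce(cluster idcode)
estimates store absorbed

* Coefficients and standard errors for the regressors of interest
estimates table within lsdv absorbed, keep(age c.age#c.age) b(%9.7f) se(%9.7f)
```

The coefficients on age and its square should agree across the three columns; any differences should be confined to the standard errors.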

Kind regards,
Carlo
(StataNow 18.5)


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17352
#7

10 Dec 2020, 04:46

Patrick:
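the standard errors are clustered on different units in the two specifications. In your first code, -egen, group(company_key year)- creates one cluster per firm-year combination, so the errors are only allowed to be correlated within the same firm in the same year. In -reghdfe-, -vce(cluster company_key year)- requests two-way clustering, which allows correlation within each firm across years and within each year across firms.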
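
A quick way to see what a combined firm-year identifier actually clusters on is to count the observations in each group. This is only a sketch: company_key and year are the variables from post #1, while fy_group and n_in_group are hypothetical names introduced here.

```stata
* One group per firm-year combination (hypothetical variable names)
egen fy_group = group(company_key year)

* Number of observations that fall into each firm-year group
bysort fy_group: generate n_in_group = _N

* Distribution of group sizes
tabulate n_in_group
```

If every group contains a single observation, as in a panel with one row per firm and year, clustering on that identifier is close to asking for heteroskedasticity-robust standard errors, not for clustering by firm or by year.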

Kind regards,
Carlo
(StataNow 18.5)
Patrick Johannes

Join Date: Dec 2020

Posts: 40
#8

10 Dec 2020, 05:52

Thanks a lot! This helps!
I still need to cluster by firm and year.
Is there any possibility to do that in the linear regression model with the -regress- command, or do I need to use the -reghdfe- command for this?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17352
#9

10 Dec 2020, 06:42

Patrick:
you should use -reghdfe- for multi-way clustering of the standard errors.
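
For intuition only, the two-way clustered variance of Cameron, Gelbach and Miller can also be assembled by hand from three one-way clustered -regress- fits: the firm-clustered variance plus the year-clustered variance minus the variance clustered on the firm-year intersection. The sketch below assumes the variables from post #1; firm_year and the matrix names are hypothetical. Because each -regress- call applies its own small-sample correction, the result may differ slightly from what -reghdfe- reports, so treat it as a check rather than a replacement.

```stata
* Intersection of the two clustering dimensions (hypothetical name)
egen firm_year = group(company_key year)

* One-way clustering by firm
regress firm_beta_w esg_single_lag1 i.industry_key i.year, vce(cluster company_key)
matrix V_firm = e(V)

* One-way clustering by year
regress firm_beta_w esg_single_lag1 i.industry_key i.year, vce(cluster year)
matrix V_year = e(V)

* One-way clustering by firm-year intersection
regress firm_beta_w esg_single_lag1 i.industry_key i.year, vce(cluster firm_year)
matrix V_both = e(V)

* Two-way variance: V_firm + V_year - V_both
matrix V_twoway = V_firm + V_year - V_both

* esg_single_lag1 is the first regressor, hence element [1,1]
display "two-way clustered SE: " sqrt(V_twoway[1,1])
```

Note that the combined matrix is not guaranteed to be positive semidefinite, so a diagonal element can occasionally come out negative, in which case the displayed standard error is missing.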

Kind regards,
Carlo
(StataNow 18.5)
Patrick Johannes

Join Date: Dec 2020

Posts: 40
#10

10 Dec 2020, 07:58

Ok
thank you so much!