  • Cluster standard errors by firm and year

    Hello all,

    I am running a fixed-effects regression using the LSDV approach (industry and year FE), and I want to cluster standard errors by firm and year.
    Can I do this by grouping the two variables, as in the code below? Is this the correct way to do it?
    I appreciate your help.

    Code:
     xtset id year
    
    
    . egen double_cluster = group (company_key year)
    
    . regress firm_beta_w esg_single_lag1 i.industry_key i.year, vce (cluster double_cluster)

  • #2
    Patrick:
    welcome to this forum.
    Some comments about your post:
    - there's no need to -xtset- your data if you decide to go (pooled) OLS;
    - the community-contributed command -reghdfe- allows multi-way clustering.
    Kind regards,
    Carlo
    (StataNow 18.5)
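
    As an illustration of the multi-way clustering mentioned above, here is a minimal sketch on the public nlswork dataset rather than the poster's own data. It assumes -reghdfe- and its dependency -ftools- are already installed. The mapping to the original question is an assumption for illustration only: idcode stands in for company_key and ind_code for industry_key.

    ```stata
    * Sketch only: assumes -reghdfe- (and its dependency -ftools-) are installed, e.g.
    *   ssc install ftools
    *   ssc install reghdfe
    use "https://www.stata-press.com/data/r16/nlswork.dta", clear

    * Absorb industry and year fixed effects;
    * cluster on person (idcode) and on year, i.e. two-way clustering
    reghdfe ln_wage c.age##c.age, absorb(ind_code year) vce(cluster idcode year)
    ```

    Listing two variables inside vce(cluster ...) requests two-way clustering, which is different from clustering on a single variable built from their combinations.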



    • #3
      Hi Carlo,

      Thanks for your fast answer. Is this not the right way to implement industry and year fixed effects (i.industry and i.year)? I cannot -xtset industry year-, because each industry has multiple observations per year.
      I wanted to implement the fixed effects by estimating industry and year dummies.



      • #4
        Patrick:
        if you have multiple observations for the same panel in each year (and Stata complains about repeated time values within panel), you can simply -xtset- your dataset with -panelid- only.
        However, this fix comes at the cost of ruling out time-series operators, such as lags and leads.
        I do hope that the following example can be helpful:
        Code:
        . use "https://www.stata-press.com/data/r16/nlswork.dta"
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        
        . xtset idcode
               panel variable:  idcode (unbalanced)
        
        . xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
        
        Fixed-effects (within) regression               Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-sq:                                           Obs per group:
             within  = 0.1162                                         min =          1
             between = 0.1078                                         avg =        6.1
             overall = 0.0932                                         max =         15
        
                                                        F(16,4709)        =      79.11
        corr(u_i, Xb)  = 0.0613                         Prob > F          =     0.0000
        
                                     (Std. Err. adjusted for 4,710 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   .0728746    .013687     5.32   0.000     .0460416    .0997075
                     |
         c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
                     |
                year |
                 69  |   .0647054   .0155249     4.17   0.000     .0342693    .0951415
                 70  |   .0284423   .0264639     1.07   0.283    -.0234395     .080324
                 71  |   .0579959   .0384111     1.51   0.131    -.0173078    .1332996
                 72  |   .0510671   .0502675     1.02   0.310    -.0474808     .149615
                 73  |   .0424104   .0624924     0.68   0.497    -.0801038    .1649247
                 75  |   .0151376    .086228     0.18   0.861    -.1539096    .1841848
                 77  |   .0340933   .1106841     0.31   0.758    -.1828994     .251086
                 78  |   .0537334   .1232232     0.44   0.663    -.1878417    .2953084
                 80  |   .0369475   .1473725     0.25   0.802    -.2519716    .3258667
                 82  |   .0391687   .1715621     0.23   0.819    -.2971733    .3755108
                 83  |    .058766   .1836086     0.32   0.749    -.3011928    .4187249
                 85  |   .1042758   .2080199     0.50   0.616    -.3035406    .5120922
                 87  |   .1242272   .2327328     0.53   0.594    -.3320379    .5804922
                 88  |   .1904977   .2486083     0.77   0.444    -.2968909    .6778863
                     |
               _cons |   .3937532   .2469015     1.59   0.111    -.0902893    .8777957
        -------------+----------------------------------------------------------------
             sigma_u |  .40275174
             sigma_e |  .30127563
                 rho |  .64120306   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        .
        Kind regards,
        Carlo
        (StataNow 18.5)
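
        The trade-off described above (no lags or leads once the time variable is dropped from -xtset-) can be checked with a short sketch on the same nlswork data. The variable name lag_wage is hypothetical and used only for illustration.

        ```stata
        use "https://www.stata-press.com/data/r16/nlswork.dta", clear

        * Declare the panel variable only: no time variable is set
        xtset idcode

        * The lag operator needs a time variable, so this line is expected to fail;
        * -capture noisily- displays the error without stopping the do-file
        capture noisily generate lag_wage = L.ln_wage

        * Declaring the time variable as well makes time-series operators available
        xtset idcode year
        generate lag_wage = L.ln_wage
        ```

        The second -xtset- only works when each panel has at most one observation per time value, which is the case for idcode and year in this dataset.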



        • #5
          Hello Carlo,

          I tried the following:
          Code:
          reghdfe firm_beta_w esg_single_lag1, absorb(industry_key year) vce (cluster company_key year)
          It gives me the same coefficients as my first approach.
          However, the standard errors are different, and I assume the second set must be the right one.
          Could you explain the difference between my first approach and the second one?
          I would appreciate it.



          • #6
            Otherwise, you can go OLS without -xtset-ting your data:
            Code:
            . use "https://www.stata-press.com/data/r16/nlswork.dta"
            (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
            
            . xtreg ln_wage c.age##c.age i.year if idcode<=3, fe vce(cluster idcode)
            
            Fixed-effects (within) regression               Number of obs     =         39
            Group variable: idcode                          Number of groups  =          3
            
            R-sq:                                           Obs per group:
                 within  = 0.7404                                         min =         12
                 between = 0.4068                                         avg =       13.0
                 overall = 0.4014                                         max =         15
            
                                                            F(4,2)            =          .
            corr(u_i, Xb)  = -0.8560                        Prob > F          =          .
            
                                             (Std. Err. adjusted for 3 clusters in idcode)
            ------------------------------------------------------------------------------
                         |               Robust
                 ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     age |   .0773019   .0101936     7.58   0.017     .0334424    .1211613
                         |
             c.age#c.age |  -.0045583   .0021586    -2.11   0.169    -.0138461    .0047294
                         |
                    year |
                     69  |   .3367906   .0871839     3.86   0.061    -.0383313    .7119126
                     70  |   .2089384   .2733588     0.76   0.525    -.9672295    1.385106
                     71  |   .3144116   .1543689     2.04   0.179    -.3497843    .9786076
                     72  |   .5888124   .4728115     1.25   0.339    -1.445531    2.623156
                     73  |   .8912873   .4976548     1.79   0.215    -1.249948    3.032523
                     75  |   1.246958   .5791178     2.15   0.164    -1.244785    3.738701
                     77  |   1.560689   .8225333     1.90   0.198    -1.978387    5.099764
                     78  |   1.941522   1.218922     1.59   0.252    -3.303077    7.186121
                     80  |    2.34498   1.454951     1.61   0.248    -3.915167    8.605128
                     82  |   2.698954   1.585626     1.70   0.231    -4.123442     9.52135
                     83  |   2.994437   1.730077     1.73   0.226    -4.449484    10.43836
                     85  |   3.538578   2.107946     1.68   0.235    -5.531183    12.60834
                     87  |   3.965153      2.346     1.69   0.233     -6.12887    14.05918
                     88  |    4.40786   2.563793     1.72   0.228    -6.623251    15.43897
                         |
                   _cons |   1.465543   .3990418     3.67   0.067    -.2513952    3.182481
            -------------+----------------------------------------------------------------
                 sigma_u |  .54258328
                 sigma_e |  .21942548
                     rho |  .85944136   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            
            . reg ln_wage c.age##c.age i.year i.idcode if idcode<=3,  vce(cluster idcode)
            
            Linear regression                               Number of obs     =         39
                                                            F(2, 2)           =          .
                                                            Prob > F          =          .
                                                            R-squared         =     0.8139
                                                            Root MSE          =     .21943
            
                                             (Std. Err. adjusted for 3 clusters in idcode)
            ------------------------------------------------------------------------------
                         |               Robust
                 ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     age |   .0773019   .0106911     7.23   0.019     .0313017    .1233021
                         |
             c.age#c.age |  -.0045583    .002264    -2.01   0.182    -.0142995    .0051828
                         |
                    year |
                     69  |   .3367906   .0914392     3.68   0.066    -.0566405    .7302218
                     70  |   .2089384   .2867011     0.73   0.542    -1.024637    1.442514
                     71  |   .3144116   .1619035     1.94   0.192     -.382203    1.011026
                     72  |   .5888124   .4958888     1.19   0.357    -1.544825     2.72245
                     73  |   .8912873   .5219448     1.71   0.230     -1.35446    3.137034
                     75  |   1.246958   .6073839     2.05   0.176    -1.366404     3.86032
                     77  |   1.560689   .8626802     1.81   0.212    -2.151125    5.272502
                     78  |   1.941522   1.278416     1.52   0.268    -3.559059    7.442103
                     80  |    2.34498   1.525965     1.54   0.264    -4.220718    8.910678
                     82  |   2.698954   1.663018     1.62   0.246    -4.456435    9.854344
                     83  |   2.994437    1.81452     1.65   0.241    -4.812813    10.80169
                     85  |   3.538578   2.210833     1.60   0.251    -5.973868    13.05102
                     87  |   3.965153   2.460506     1.61   0.248    -6.621548    14.55185
                     88  |    4.40786   2.688929     1.64   0.243    -7.161667    15.97739
                         |
                  idcode |
                      2  |  -.4183815   .0165036   -25.35   0.002    -.4893909   -.3473721
                      3  |   .6579353   .7215294     0.91   0.458    -2.446555    3.762426
                         |
                   _cons |   1.341224   .1489003     9.01   0.012     .7005575     1.98189
            ------------------------------------------------------------------------------
            
            .
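            As you can see, the point estimates for the shared regressors are the same; the clustered standard errors differ slightly because the two commands apply different degrees-of-freedom adjustments.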
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #7
              Patrick:
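              standard errors were clustered on different units in your first code and in -reghdfe-: your first code treats each firm-year combination as a single cluster, whereas -reghdfe- clusters on firm and on year separately (two-way clustering).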
              Kind regards,
              Carlo
              (StataNow 18.5)
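
              To see what clustering on a grouped firm-year variable amounts to, here is a hedged sketch on the nlswork data, with idcode standing in for the firm identifier (an assumption for illustration). When every identifier-year pair occurs only once, each cluster contains a single observation, so the clustered standard errors should essentially coincide with ordinary heteroskedasticity-robust ones.

              ```stata
              use "https://www.stata-press.com/data/r16/nlswork.dta", clear

              * One cluster per person-year combination, as in the first approach
              egen double_cluster = group(idcode year)

              * -isid- runs silently if person-year pairs are unique
              * (and exits with an error otherwise)
              isid idcode year

              * Clustering on single-observation groups ...
              regress ln_wage c.age##c.age i.ind_code i.year, vce(cluster double_cluster)

              * ... should give essentially the same standard errors as vce(robust)
              regress ln_wage c.age##c.age i.ind_code i.year, vce(robust)
              ```

              Neither regression allows for correlation within a person over time or within a year across persons, which is what two-way clustering is meant to capture.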



              • #8
                Thanks a lot! This helps!
                I still need to cluster by firm and year.
                Is there any way to do that in the linear regression model with the reg command, or do I need to use the reghdfe command for this?



                • #9
                  Patrick:
                  you should use -reghdfe- for multi-way clustering of the standard errors.
                  Kind regards,
                  Carlo
                  (StataNow 18.5)



                  • #10
                    Ok
                    thank you so much!
