Why does the negative adjusted R squared become positive after clustering by firm in xtreg, fe?

Jae Li

Join Date: May 2017

Posts: 184
#1

10 Nov 2022, 04:34

Hello everyone!

I'd like to ask about the intuition behind a result from -xtreg, fe-. When I did not cluster by firm id, I obtained a negative adjusted R2 in the regression. However, when I added clustering by firm id, the same regression gave me a positive adjusted R2. Do you know why?

Many thanks for your time in advance!

Best regards,
Jae
Jae Li

Join Date: May 2017

Posts: 184
#2

21 Nov 2022, 10:04

Does anyone have a clue, please?
George Ford

Join Date: Aug 2014

Posts: 3121
#3

21 Nov 2022, 10:33

Strange. Do the coefficients change? Are the observations identical in both runs?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#4

21 Nov 2022, 11:13

Jae:
without sharing what you typed and what Stata gave you back (as per the FAQ), you are really unlikely to receive a helpful reply.

Kind regards,
Carlo
(Stata 19.0)
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#5

21 Nov 2022, 11:16

I'm a little confused: -xtreg- does not report an "adjusted" R2, so what exactly are you talking about? Maybe show the output by copying and pasting it within a CODE block.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#6

21 Nov 2022, 11:46

Jae:
I can replicate your issue (actually, -xtreg, fe- does return an adjusted R-squared as e(r2_a), which is listed by -ereturn list-):

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
. xtreg ln_wage age, fe

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1026                                         min =          1
     Between = 0.0877                                         avg =        6.1
     Overall = 0.0774                                         max =         15

                                                F(1,23799)        =    2720.20
corr(u_i, Xb) = 0.0314                          Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0003477    52.16   0.000     .0174534    .0188164
       _cons |   1.148214   .0102579   111.93   0.000     1.128107     1.16832
-------------+----------------------------------------------------------------
     sigma_u |  .40635023
     sigma_e |  .30349389
         rho |  .64192015   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4709, 23799) = 8.81                 Prob > F = 0.0000

. di e(r2_a)
-.07503239

. xtreg ln_wage age, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1026                                         min =          1
     Between = 0.0877                                         avg =        6.1
     Overall = 0.0774                                         max =         15

                                                F(1,4709)         =     884.05
corr(u_i, Xb) = 0.0314                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006099    29.73   0.000     .0169392    .0193306
       _cons |   1.148214   .0177153    64.81   0.000     1.113483    1.182944
-------------+----------------------------------------------------------------
     sigma_u |  .40635023
     sigma_e |  .30349389
         rho |  .64192015   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. di e(r2_a)
.10254329

.

An old Statalist thread (https://www.stata.com/statalist/arch.../msg00201.html) explains how to calculate the adjusted R-squared manually after -xtreg, fe-.

Kind regards,
Carlo
(Stata 19.0)


George Ford

Join Date: Aug 2014

Posts: 3121
#7

21 Nov 2022, 13:20

If you calculate it manually, you get the same adjusted R2 with and without clustering, because the R2 itself is the same. So Stata must be calculating it differently in the two cases.

If you switch to -reghdfe-, you get the same r2_a in both cases, but it's much larger.
Andrew Musau

Join Date: Oct 2014

Posts: 10091
#8

21 Nov 2022, 13:38

Originally posted by Jae Li

Hello everyone!

I'd like to ask about the intuition behind a result from -xtreg, fe-. When I did not cluster by firm id, I obtained a negative adjusted R2 in the regression. However, when I added clustering by firm id, the same regression gave me a positive adjusted R2. Do you know why?

Many thanks for your time in advance!

Best regards,
Jae

See #9: https://www.statalist.org/forums/for...ted-as-missing. With clustering, observations are assumed to be independent across clusters but not within them, so the degrees of freedom change. However, as I and others argue in the linked thread, the adjusted within-R2 as calculated is not very useful. There, I propose an alternative way to calculate it.
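As a rough check of the degrees-of-freedom point, the two e(r2_a) values in #6 can be approximated by hand from the figures in the posted output. The sketch below is plain arithmetic in Python, not Stata code. It assumes the usual adjustment 1 - (1 - R2)(N - 1)/df, and it assumes that the residual degrees of freedom are N minus the number of groups minus the number of slopes without clustering, and N minus the number of slopes minus 1 with clustering. That accounting is inferred from the numbers in #6 rather than taken from Stata's documentation, and the function name adjusted_r2 is illustrative only.

```python
# Figures copied from the -xtreg, fe- output posted in #6 (ln_wage on age).
n_obs = 28510        # Number of obs
n_groups = 4710      # Number of groups (absorbed fixed effects)
k = 1                # number of slope coefficients (age)
r2_within = 0.1026   # within R-squared as displayed, rounded to 4 digits


def adjusted_r2(r2, n, df_resid):
    """Textbook adjustment: 1 - (1 - R2) * (n - 1) / df_resid."""
    return 1 - (1 - r2) * (n - 1) / df_resid


# Assumption: without clustering, the absorbed fixed effects are charged
# against the residual degrees of freedom (23,799, matching F(1,23799)).
df_unclustered = n_obs - n_groups - k

# Assumption: with vce(cluster idcode), the absorbed effects are not
# charged, which leaves n - k - 1 residual degrees of freedom.
df_clustered = n_obs - k - 1

r2a_unclustered = adjusted_r2(r2_within, n_obs, df_unclustered)
r2a_clustered = adjusted_r2(r2_within, n_obs, df_clustered)

print(f"df = {df_unclustered}: adjusted R2 = {r2a_unclustered:.4f}")
print(f"df = {df_clustered}: adjusted R2 = {r2a_clustered:.4f}")
```

Under these assumptions the two results land within rounding error of the reported -.07503239 and .10254329; the small gap comes from using the displayed four-digit within R2. The within R2 and the coefficients are identical in both runs, so only the degrees-of-freedom correction moves the sign.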