R-squared in Panel Data (FE/RE models)

Farhad Mammadov

Join Date: May 2017
Posts: 10


27 Aug 2020, 08:10

Dear all,

I am working with panel data, estimating fixed- and random-effects models, and am now writing up the interpretation. However, I am not sure whether R-squared in panel data should be interpreted in the same way as in cross-sectional data.
Please see below the output of my FE regression.

Code:

. xtreg risk age income health chilsize eduyears marstat empstat lifesat, fe vce (cluster id)

Fixed-effects (within) regression               Number of obs      =     15543
Group variable: id                              Number of groups   =      1459

R-sq:  within  = 0.0072                         Obs per group: min =         1
       between = 0.0486                                        avg =      10.7
       overall = 0.0292                                        max =        13

                                                F(8,1458)          =     10.02
corr(u_i, Xb)  = 0.0674                         Prob > F           =    0.0000

                                  (Std. Err. adjusted for 1459 clusters in id)
------------------------------------------------------------------------------
             |               Robust
        risk |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0074099   .0044295    -1.67   0.095    -.0160988    .0012789
      income |   .0000404   .0000157     2.56   0.010     9.46e-06    .0000712
      health |   .0468225   .0244021     1.92   0.055    -.0010445    .0946895
    chilsize |  -.0213291   .0323559    -0.66   0.510    -.0847981      .04214
    eduyears |  -.0166755    .045874    -0.36   0.716    -.1066617    .0733106
     marstat |  -.3777056   .0998787    -3.78   0.000    -.5736269   -.1817844
     empstat |   .1179813   .1050466     1.12   0.262    -.0880773      .32404
     lifesat |   .0817402   .0151808     5.38   0.000     .0519618    .1115187
       _cons |   4.479306   .6160986     7.27   0.000     3.270771     5.68784
-------------+----------------------------------------------------------------
     sigma_u |  1.5979702
     sigma_e |  1.4577463
         rho |  .54579262   (fraction of variance due to u_i)
------------------------------------------------------------------------------

My question is: how would you interpret the within R-squared above?

Thank you for your kind feedback.

Best


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17674
#2

27 Aug 2020, 09:14

Farhad:
the within R-sq is the coefficient of determination to look at after -xtreg, fe-.
Due to the -fe- machinery, if your time-varying predictors show limited within-panel variation, the within R-sq will probably be low.
I would also check your model for possible misspecification of the functional form of the regressand.
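
A quick way to see how much within variation the predictors carry is -xtsum-, which splits each variable's standard deviation into overall, between and within components. What follows is only a minimal sketch: it assumes the data are already declared as a panel and reuses the variable names from the -xtreg- call in #1. The stored results e(r2_w), e(r2_b) and e(r2_o) are the three R-sq values shown in the -xtreg, fe- header.

```stata
* Sketch only: assumes the panel is already declared (e.g. -xtset id-)
* and that the variables are named as in the -xtreg- call in #1.

* Overall, between and within standard deviations of each variable
xtsum risk age income health chilsize eduyears marstat empstat lifesat

* The three R-sq values are stored after -xtreg, fe-
quietly xtreg risk age income health chilsize eduyears marstat empstat lifesat, fe vce(cluster id)
display "within  R-sq = " %6.4f e(r2_w)
display "between R-sq = " %6.4f e(r2_b)
display "overall R-sq = " %6.4f e(r2_o)
```

If the within standard deviations are small relative to the between ones, a low within R-sq is what one would expect.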

Last edited by Carlo Lazzaro; 27 Aug 2020, 09:19.

Kind regards,
Carlo
(Stata 19.0)

Farhad Mammadov

Join Date: May 2017
Posts: 10

27 Aug 2020, 09:24

Dear Carlo, thank you for your response. I am aware that the within R-squared in my case is low, but I think this is mostly due to the limited within variation in the time-varying predictors, as you said, rather than to model misspecification.
I checked for misspecification with the method below and found no evidence of it.

Code:

. predict yhat, xb
(6050 missing values generated)

. gen yhat2 = yhat^2
(6050 missing values generated)

. gen yhat3 = yhat^3
(6050 missing values generated)

. gen yhat4 = yhat^4
(6050 missing values generated)

. xtreg risk age income health chilsize eduyears marstat empstat lifesat yhat2 ///
> yhat3 yhat4, fe vce (cluster id) 

Fixed-effects (within) regression               Number of obs      =     15543
Group variable: id                              Number of groups   =      1459

R-sq:  within  = 0.0076                         Obs per group: min =         1
       between = 0.0481                                        avg =      10.7
       overall = 0.0289                                        max =        13

                                                F(11,1458)         =      7.58
corr(u_i, Xb)  = 0.0665                         Prob > F           =    0.0000

                                  (Std. Err. adjusted for 1459 clusters in id)
------------------------------------------------------------------------------
             |               Robust
        risk |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -1.862279   1.011286    -1.84   0.066    -3.846011    .1214517
      income |   .0101403   .0055067     1.84   0.066    -.0006616    .0209421
      health |   11.76919   6.390464     1.84   0.066    -.7662966    24.30468
    chilsize |  -5.361616   2.910852    -1.84   0.066    -11.07152    .3482893
    eduyears |  -4.189353   2.276937    -1.84   0.066    -8.655775    .2770684
     marstat |  -94.92931   51.52908    -1.84   0.066    -196.0084     6.14974
     empstat |   29.63328   16.08775     1.84   0.066    -1.924327    61.19089
     lifesat |   20.53954   11.15232     1.84   0.066    -1.336765    42.41584
       yhat2 |  -75.34484   41.27686    -1.83   0.068    -156.3132    5.623544
       yhat3 |   9.990895   5.497914     1.82   0.069    -.7937719    20.77556
       yhat4 |  -.4920347   .2716595    -1.81   0.070     -1.02492    .0408505
       _cons |   816.4166   443.5618     1.84   0.066    -53.67081    1686.504
-------------+----------------------------------------------------------------
     sigma_u |  1.5983123
     sigma_e |  1.4576044
         rho |  .54594701   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test yhat2 yhat3 yhat4

 ( 1)  yhat2 = 0
 ( 2)  yhat3 = 0
 ( 3)  yhat4 = 0

       F(  3,  1458) =    1.31
            Prob > F =    0.2681

In the end, I wanted to know: can I just say that "my within R-squared is 0.007" without any further interpretation, or do I need to explain and justify this low R-squared? This is for my term paper, and I want to interpret it properly. Thanks and best.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17674
#4

27 Aug 2020, 09:49

Farhad:
as the test gives no evidence of misspecification (Prob > F = 0.2681), you can simply state that your low within R-sq is due to limited variation in the time-varying predictors across the T dimension of your panel dataset.
Whether this result is frequent or not in your research field, unfortunately I cannot say.
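
To make concrete what the within R-sq measures, it can help to reproduce it by hand: it is the ordinary R-sq from OLS on the panel-demeaned data, so it ignores whatever the individual effects u_i explain. The sketch below is illustrative only. The names insample, mean_* and dm_* are hypothetical helpers introduced here, and the final -areg- call is added purely as a contrast, since its R-sq also counts the variance absorbed by the id effects.

```stata
* Sketch only: run as one block in a do-file; variable names follow #1.
local y risk
local x age income health chilsize eduyears marstat empstat lifesat

quietly xtreg `y' `x', fe vce(cluster id)
display "xtreg, fe within R-sq     : " %6.4f e(r2_w)
generate byte insample = e(sample)      // hypothetical marker for the estimation sample

* Subtract panel means, computed over the estimation sample only
foreach v in `y' `x' {
    bysort id: egen double mean_`v' = mean(`v') if insample
    generate double dm_`v' = `v' - mean_`v'
}

* Collect the names of the demeaned regressors
local dmx
foreach v in `x' {
    local dmx `dmx' dm_`v'
}

* OLS on the demeaned data: its R-sq should match the within R-sq
quietly regress dm_`y' `dmx'
display "OLS on demeaned data R-sq : " %6.4f e(r2)

* Contrast: this R-sq also includes the variance explained by the id effects
quietly areg `y' `x' if insample, absorb(id)
display "areg, absorb(id) R-sq     : " %6.4f e(r2)
```

With rho around 0.55 in the output in #1, the -areg- figure would be expected to be far higher than the within R-sq, which is one reason to say explicitly which R-sq is being reported in the paper.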

Kind regards,
Carlo
(Stata 19.0)