  • R-squared in Panel Data (FE/RE models)

    Dear all,

    I am working with panel data (fixed- and random-effects models) and am now writing up the interpretation. However, I am not sure whether R-squared in panel data should be interpreted in the same way as in cross-sectional data.
    Please see below the output of my FE regression:

    Code:
    . xtreg risk age income health chilsize eduyears marstat empstat lifesat, fe vce (cluster id)
    
    Fixed-effects (within) regression               Number of obs      =     15543
    Group variable: id                              Number of groups   =      1459
    
    R-sq:  within  = 0.0072                         Obs per group: min =         1
           between = 0.0486                                        avg =      10.7
           overall = 0.0292                                        max =        13
    
                                                    F(8,1458)          =     10.02
    corr(u_i, Xb)  = 0.0674                         Prob > F           =    0.0000
    
                                      (Std. Err. adjusted for 1459 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
            risk |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0074099   .0044295    -1.67   0.095    -.0160988    .0012789
          income |   .0000404   .0000157     2.56   0.010     9.46e-06    .0000712
          health |   .0468225   .0244021     1.92   0.055    -.0010445    .0946895
        chilsize |  -.0213291   .0323559    -0.66   0.510    -.0847981      .04214
        eduyears |  -.0166755    .045874    -0.36   0.716    -.1066617    .0733106
         marstat |  -.3777056   .0998787    -3.78   0.000    -.5736269   -.1817844
         empstat |   .1179813   .1050466     1.12   0.262    -.0880773      .32404
         lifesat |   .0817402   .0151808     5.38   0.000     .0519618    .1115187
           _cons |   4.479306   .6160986     7.27   0.000     3.270771     5.68784
    -------------+----------------------------------------------------------------
         sigma_u |  1.5979702
         sigma_e |  1.4577463
             rho |  .54579262   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    My question is: how would you interpret the within R-squared above?

    Thank you for your kind feedback.

    Best

  • #2
    Farhad:
    the within R-sq is the coefficient of determination to look at in -xtreg, fe-.
    Because of the -fe- machinery, if your time-varying predictors show limited within-panel variation, the within R-sq will probably be low.
    I would also check your model for possible misspecification of the functional form of the regressand.
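
    To make concrete what the within R-sq measures, here is a minimal sketch that should reproduce it by hand: demean every variable by panel and run OLS on the demeaned data. It assumes the poster's dataset is in memory and that `id` is the panel identifier; the `mean_` and `dm_` variable prefixes are made up for illustration.

    ```stata
    * Sketch only: assumes the poster's data are in memory.
    * Fit the FE model and show the within R-sq that -xtreg- stores.
    xtreg risk age income health chilsize eduyears marstat empstat lifesat, fe vce(cluster id)
    display "within R-sq from xtreg: " %6.4f e(r2_w)

    * Subtract each panel's mean from every variable (estimation sample only).
    foreach v of varlist risk age income health chilsize eduyears marstat empstat lifesat {
        bysort id: egen double mean_`v' = mean(`v') if e(sample)
        generate double dm_`v' = `v' - mean_`v'
    }

    * OLS on the demeaned data: its R-squared should match the within R-sq,
    * because only over-time variation inside each panel is left to explain.
    regress dm_risk dm_age dm_income dm_health dm_chilsize dm_eduyears dm_marstat dm_empstat dm_lifesat
    display "R-sq on demeaned data:  " %6.4f e(r2)
    ```

    The standard errors from this -regress- are not the clustered FE ones; the sketch is only meant to show where the R-squared comes from.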
    Last edited by Carlo Lazzaro; 27 Aug 2020, 10:19.
    Kind regards,
    Carlo
    (StataNow 18.5)


    • #3
      Dear Carlo, thank you for your response. I am aware that the within R-squared is low in my case, but I believe this is mostly due to the limited within variation in the time-varying predictors, as you said, rather than to model misspecification.
      I checked for misspecification with the method below and found no evidence of it.

      Code:
      . predict yhat, xb
      (6050 missing values generated)
      
      . gen yhat2 = yhat^2
      (6050 missing values generated)
      
      . gen yhat3 = yhat^3
      (6050 missing values generated)
      
      . gen yhat4 = yhat^4
      (6050 missing values generated)
      
      . xtreg risk age income health chilsize eduyears marstat empstat lifesat yhat2 ///
      > yhat3 yhat4, fe vce (cluster id) 
      
      Fixed-effects (within) regression               Number of obs      =     15543
      Group variable: id                              Number of groups   =      1459
      
      R-sq:  within  = 0.0076                         Obs per group: min =         1
             between = 0.0481                                        avg =      10.7
             overall = 0.0289                                        max =        13
      
                                                      F(11,1458)         =      7.58
      corr(u_i, Xb)  = 0.0665                         Prob > F           =    0.0000
      
                                        (Std. Err. adjusted for 1459 clusters in id)
      ------------------------------------------------------------------------------
                   |               Robust
              risk |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |  -1.862279   1.011286    -1.84   0.066    -3.846011    .1214517
            income |   .0101403   .0055067     1.84   0.066    -.0006616    .0209421
            health |   11.76919   6.390464     1.84   0.066    -.7662966    24.30468
          chilsize |  -5.361616   2.910852    -1.84   0.066    -11.07152    .3482893
          eduyears |  -4.189353   2.276937    -1.84   0.066    -8.655775    .2770684
           marstat |  -94.92931   51.52908    -1.84   0.066    -196.0084     6.14974
           empstat |   29.63328   16.08775     1.84   0.066    -1.924327    61.19089
           lifesat |   20.53954   11.15232     1.84   0.066    -1.336765    42.41584
             yhat2 |  -75.34484   41.27686    -1.83   0.068    -156.3132    5.623544
             yhat3 |   9.990895   5.497914     1.82   0.069    -.7937719    20.77556
             yhat4 |  -.4920347   .2716595    -1.81   0.070     -1.02492    .0408505
             _cons |   816.4166   443.5618     1.84   0.066    -53.67081    1686.504
      -------------+----------------------------------------------------------------
           sigma_u |  1.5983123
           sigma_e |  1.4576044
               rho |  .54594701   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      
      . test yhat2 yhat3 yhat4
      
       ( 1)  yhat2 = 0
       ( 2)  yhat3 = 0
       ( 3)  yhat4 = 0
      
             F(  3,  1458) =    1.31
                  Prob > F =    0.2681
      So my remaining question is: can I just say that "my within R-squared is 0.007" without any further interpretation, or do I need to explain and justify this low R-squared? This is for my term paper, and I would like to know the proper way to interpret it. Thanks and best.


      • #4
        Farhad:
        as the misspecification test rules out misspecification, you can simply state that your limited within R-sq is due to the limited variation in the time-varying predictors across the T dimension of your panel dataset.
        Whether this result is frequent or not in your research field, unfortunately I cannot say.
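
        One way to back that statement with numbers, sketched here under the assumption that the poster's data are in memory and already -xtset- with `id` as the panel variable, is -xtsum-, which splits each variable's standard deviation into between and within components:

        ```stata
        * Sketch only: assumes the data are in memory and -xtset id- has been run.
        * For each variable, -xtsum- reports overall, between and within std. dev.
        xtsum risk age income health chilsize eduyears marstat empstat lifesat
        ```

        A within standard deviation that is small relative to the between one would be consistent with the low within R-sq.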
        Kind regards,
        Carlo
        (StataNow 18.5)
