  • Is a Low Within R² in Fixed-Effects Panel Models Acceptable?

    Hello everyone,

    Based on the valuable advice I received from Professor Carlo Lazzaro in a previous discussion, I have refined my research and submitted it to an academic journal. I sincerely appreciate Professor Lazzaro’s guidance, which has been tremendously helpful in improving my work.

    https://www.statalist.org/forums/for...ion-procedures

    https://www.statalist.org/forums/for...ssion-analysis


    The dependent variable is the budget growth rate of government R&D programs, while the independent variables include program evaluation grades (average, excellent, poor), Performance Measurement Appropriateness (PMA), and the interaction term between program evaluation grades and PMA.

    The panel fixed-effects regression results yielded the following within R² values:

    Code:
    . xtreg Gbincreaset dum_Grade2 dum_Grade3 PMA interaction_Grade2_PMA interaction_Grade3_PMA dum_NationalProject2 BProportiont_1 dum_Congress2 dum_Congress3 dum_Scale2 ln_Period ln_realBUDGETt_1 i.YEAR, fe vce(cluster ID)
    Model 1: Overall Program

    Code:
    Fixed-effects (within) regression               Number of obs     =        860
    Group variable: ID                              Number of groups  =         95
    
    R-squared:                                      Obs per group:
         Within  = 0.1701                                         min =          3
         Between = 0.0106                                         avg =        9.1
         Overall = 0.0128                                         max =         11
    
                                                    F(22, 94)         =       6.12
    corr(u_i, Xb) = -0.8797                         Prob > F          =     0.0000
    
                                                  (Std. err. adjusted for 95 clusters in ID)
    ----------------------------------------------------------------------------------------
                           |               Robust
               Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -----------------------+----------------------------------------------------------------
                dum_Grade2 |   .0025008   .0210685     0.12   0.906    -.0393311    .0443328
                dum_Grade3 |  -.1265205   .0442847    -2.86   0.005    -.2144488   -.0385923
                       PMA |   -.400207   .2124456    -1.88   0.063    -.8220226    .0216087
    interaction_Grade2_PMA |   .0676065   .0641698     1.05   0.295    -.0598041    .1950171
    interaction_Grade3_PMA |  -.2909324    .169415    -1.72   0.089    -.6273098     .045445
      dum_NationalProject2 |   .0208537    .025206     0.83   0.410    -.0291934    .0709008
            BProportiont_1 |   .1649261   .0595687     2.77   0.007      .046651    .2832012
             dum_Congress2 |    .021932   .0193291     1.13   0.259    -.0164463    .0603104
             dum_Congress3 |  -.0256428   .0214164    -1.20   0.234    -.0681656      .01688
                dum_Scale2 |   .1186718   .0536993     2.21   0.030     .0120506     .225293
                 ln_Period |  -.0096512    .084944    -0.11   0.910    -.1783096    .1590072
          ln_realBUDGETt_1 |  -.1488883   .0399237    -3.73   0.000    -.2281578   -.0696188
                           |
                      YEAR |
                     2015  |    .064101   .0283541     2.26   0.026     .0078033    .1203987
                     2016  |   .0502911   .0363819     1.38   0.170     -.021946    .1225281
                     2017  |   .0178068   .0331164     0.54   0.592    -.0479466    .0835601
                     2018  |   .0134901   .0392066     0.34   0.732    -.0643555    .0913356
                     2019  |   .0364182   .0485554     0.75   0.455    -.0599896     .132826
                     2020  |    .074119   .0547573     1.35   0.179     -.034603     .182841
                     2021  |   .1087537   .0594412     1.83   0.070    -.0092682    .2267756
                     2022  |   .0338457   .0637374     0.53   0.597    -.0927064    .1603977
                     2023  |   .0092021   .0690746     0.13   0.894    -.1279471    .1463513
                     2024  |   -.079715   .0713125    -1.12   0.266    -.2213076    .0618776
                           |
                     _cons |   .7804078   .2973636     2.62   0.010     .1899854     1.37083
    -----------------------+----------------------------------------------------------------
                   sigma_u |  .24078143
                   sigma_e |   .1931013
                       rho |  .60858051   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------


    Model 2 (Department: MSIT)

    Code:
    Fixed-effects (within) regression               Number of obs     =        284
    Group variable: ID                              Number of groups  =         29
    
    R-squared:                                      Obs per group:
         Within  = 0.2884                                         min =          6
         Between = 0.1156                                         avg =        9.8
         Overall = 0.0004                                         max =         11
    
                                                    F(22, 28)         =      49.84
    corr(u_i, Xb) = -0.9607                         Prob > F          =     0.0000
    
                                                  (Std. err. adjusted for 29 clusters in ID)
    ----------------------------------------------------------------------------------------
                           |               Robust
               Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -----------------------+----------------------------------------------------------------
                dum_Grade2 |   .0146971   .0341105     0.43   0.670    -.0551751    .0845694
                dum_Grade3 |  -.0262159   .0949068    -0.28   0.784    -.2206236    .1681918
                       PMA |  -.7700591    .598621    -1.29   0.209    -1.996279    .4561604
    interaction_Grade2_PMA |   .0516906     .08433     0.61   0.545    -.1210515    .2244328
    interaction_Grade3_PMA |   -.050318   .3672456    -0.14   0.892    -.8025866    .7019505
      dum_NationalProject2 |    .042069   .0288556     1.46   0.156    -.0170389    .1011769
            BProportiont_1 |   .2932974   .1111382     2.64   0.013     .0656411    .5209536
             dum_Congress2 |   .0003401   .0309072     0.01   0.991    -.0629704    .0636506
             dum_Congress3 |  -.0729838   .0459666    -1.59   0.124    -.1671421    .0211745
                dum_Scale2 |   .1869602   .0808099     2.31   0.028     .0214287    .3524917
                 ln_Period |    -.24902   .2212701    -1.13   0.270    -.7022712    .2042311
          ln_realBUDGETt_1 |  -.3046663   .0614343    -4.96   0.000    -.4305088   -.1788237
                           |
                      YEAR |
                     2015  |   .1142503   .0740903     1.54   0.134    -.0375168    .2660175
                     2016  |   .1994159   .0813761     2.45   0.021     .0327246    .3661073
                     2017  |   .1641194   .1036531     1.58   0.125    -.0482044    .3764432
                     2018  |   .1158698   .1222286     0.95   0.351    -.1345041    .3662437
                     2019  |   .2071422   .1230809     1.68   0.103    -.0449776     .459262
                     2020  |   .2357455   .1617658     1.46   0.156    -.0956167    .5671077
                     2021  |    .307164   .1437909     2.14   0.042     .0126216    .6017063
                     2022  |   .2772618   .1672246     1.66   0.108    -.0652823    .6198059
                     2023  |   .2929786   .1679828     1.74   0.092    -.0511186    .6370759
                     2024  |   .1663899   .1717733     0.97   0.341    -.1854717    .5182515
                           |
                     _cons |   1.988337   .5953076     3.34   0.002     .7689049     3.20777
    -----------------------+----------------------------------------------------------------
                   sigma_u |  .51847783
                   sigma_e |  .20567156
                       rho |  .86403708   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------
    Model 3 (Department: MOTIE)

    Code:
    Fixed-effects (within) regression               Number of obs     =        224
    Group variable: ID                              Number of groups  =         24
    
    R-squared:                                      Obs per group:
         Within  = 0.2858                                         min =          5
         Between = 0.0353                                         avg =        9.3
         Overall = 0.0199                                         max =         11
    
                                                    F(22, 23)         =     143.70
    corr(u_i, Xb) = -0.8989                         Prob > F          =     0.0000
    
                                                  (Std. err. adjusted for 24 clusters in ID)
    ----------------------------------------------------------------------------------------
                           |               Robust
               Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -----------------------+----------------------------------------------------------------
                dum_Grade2 |  -.0040038   .0498117    -0.08   0.937    -.1070472    .0990395
                dum_Grade3 |  -.2060176   .0228124    -9.03   0.000    -.2532086   -.1588265
                       PMA |  -1.102502    .362825    -3.04   0.006    -1.853062   -.3519409
    interaction_Grade2_PMA |   .2784458   .2270171     1.23   0.232    -.1911748    .7480664
    interaction_Grade3_PMA |  -.2284704   .1077059    -2.12   0.045     -.451277   -.0056638
      dum_NationalProject2 |   .0794018   .0550323     1.44   0.163    -.0344412    .1932447
            BProportiont_1 |    .018985   .0640657     0.30   0.770    -.1135451    .1515151
             dum_Congress2 |     .04456   .0345698     1.29   0.210    -.0269531    .1160732
             dum_Congress3 |  -.0483268   .0290808    -1.66   0.110     -.108485    .0118315
                dum_Scale2 |   .0149386   .0401202     0.37   0.713    -.0680563    .0979335
                 ln_Period |  -.2137149   .1050459    -2.03   0.054    -.4310189     .003589
          ln_realBUDGETt_1 |  -.0837218   .0510144    -1.64   0.114    -.1892532    .0218096
                           |
                      YEAR |
                     2015  |   .0881138   .0457769     1.92   0.067    -.0065829    .1828106
                     2016  |  -.0021486   .0463278    -0.05   0.963     -.097985    .0936878
                     2017  |    .032101   .0573949     0.56   0.581    -.0866295    .1508314
                     2018  |   .0556201   .0706875     0.79   0.439    -.0906082    .2018483
                     2019  |   .1412439     .07652     1.85   0.078    -.0170498    .2995376
                     2020  |   .2123569   .0820638     2.59   0.016     .0425951    .3821187
                     2021  |    .173264   .0876465     1.98   0.060    -.0080467    .3545746
                     2022  |   .0832085   .0957713     0.87   0.394    -.1149096    .2813265
                     2023  |    .099512   .1021161     0.97   0.340    -.1117313    .3107553
                     2024  |   .0274431   .1289697     0.21   0.833     -.239351    .2942372
                           |
                     _cons |   1.725018   .4687378     3.68   0.001     .7553602    2.694676
    -----------------------+----------------------------------------------------------------
                   sigma_u |  .30347756
                   sigma_e |  .18860531
                       rho |  .72137702   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------
    • Model 1: 0.1701
    • Model 2: 0.2884
    • Model 3: 0.2858
    One of the reviewers raised concerns about whether these within R² values are acceptable.

    I have searched for relevant discussions on Statalist and found multiple threads explaining that the R² in a fixed-effects model refers to the within R². However, I could not find a clear consensus on what range of within R² is generally considered acceptable.

    Could you kindly share insights on what range of within R² is typically acceptable in fixed-effects panel models? Specifically, are my within R² values (0.1701, 0.2884, 0.2858) reasonable in the context of panel data analysis? If there are any recommended references on this topic, I would greatly appreciate your suggestions.

    Thank you very much for your time and help!

  • #2
    Hi Hyunjin,
    A low within R² in a fixed-effects model essentially means either that the model is misspecified or that the explanatory variables vary little over time within panels in your dataset. While I cannot say how much you would expect PMA or the grades to vary, a low within R² on its own is not the best way to evaluate your model. Since a reviewer raised the question, I assume the expectation is that your explanatory variables should vary much more over time than they do in your dataset. If you have convinced yourself that there is no misspecification, the answer is that your dataset is constrained in that respect.

    You could actually look at the discussion here for a better understanding of your problem: https://www.statalist.org/forums/for...a-fe-re-models
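
    As a rough way to see how much a regressor moves within panels, -xtsum- splits each variable's standard deviation into between and within components. Below is a minimal sketch on the nlswork example dataset (an assumption for illustration, not your data; you would substitute your own variables, such as PMA and the grade dummies):

    ```stata
    * Illustrative sketch on a Stata example dataset, not the poster's data
    webuse nlswork, clear
    xtset idcode year

    * Overall, between, and within standard deviations for each variable
    xtsum ln_wage age tenure

    * Rough gauge: within sd relative to overall sd for one regressor
    quietly xtsum tenure
    display "within sd / overall sd for tenure = " %5.3f r(sd_w)/r(sd)
    ```

    A regressor whose within standard deviation is small relative to its overall one has little within-panel variation left with which to explain the outcome, which tends to keep the within R² low.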


    • #3
      A low within R² in a fixed-effects regression isn’t necessarily a problem. The within R² only measures variation explained after the fixed effects are removed, so it naturally tends to be lower than regular R² values (and yours are not that low, really). The fixed effects already account for all time-invariant characteristics, which often explain a large portion of the total variation, and the within R² ignores that explained variation. I’d focus on whether your coefficients are statistically significant, whether their magnitudes and signs are sensible, whether the model is theoretically sound, and whether the F statistic is significant. I don’t usually report R² for two-way fixed-effects (2WFE) models.
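
      To make the point concrete, here is a minimal sketch, assuming the nlswork example dataset and an arbitrary specification (both chosen for illustration only, not the original poster's model). It demeans the variables by panel, checks that OLS on the demeaned data reproduces the within R² from -xtreg, fe-, and then uses -areg- to report an R² that also counts the variation absorbed by the fixed effects:

      ```stata
      * Illustrative sketch on a Stata example dataset, not the poster's data
      webuse nlswork, clear
      xtset idcode year

      * Fixed-effects fit; e(r2_w) is the within R-squared
      xtreg ln_wage age tenure, fe
      scalar r2_within = e(r2_w)
      generate byte insample = e(sample)

      * Demean each variable by panel over the estimation sample
      foreach v in ln_wage age tenure {
          bysort idcode: egen double mean_`v' = mean(`v') if insample
          generate double dm_`v' = `v' - mean_`v'
      }

      * OLS on demeaned data: its R-squared matches the within R-squared
      regress dm_ln_wage dm_age dm_tenure
      scalar r2_demeaned = e(r2)
      display "within (xtreg) = " %6.4f r2_within "   demeaned OLS = " %6.4f r2_demeaned

      * areg absorbs the panel dummies; its R-squared includes what they explain
      areg ln_wage age tenure, absorb(idcode)
      display "R-squared including fixed effects (areg) = " %6.4f e(r2)
      ```

      The -areg- figure can never be smaller than the within R², because both use the same residual sum of squares but -areg- divides by the total rather than the within-panel variation of the outcome.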


      • #4
        Hyunjin:
        1) stating that a given coefficient of determination is low/high/else without providing a comparative standard makes the question difficult to answer;
        2) that said, I would test whether the regression you reported in your paper is correctly specified. To this aim, you can replicate by hand the procedure detailed in the -linktest- entry of the Stata .pdf manual, as you can see in the following toy example:
        Code:
        . use https://www.stata-press.com/data/r18/nlswork.dta
        (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
        
        . xtreg ln_wage c.age##c.age, fe vce(cluster idcode )
        
        Fixed-effects (within) regression               Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-squared:                                      Obs per group:
             Within  = 0.1087                                         min =          1
             Between = 0.1006                                         avg =        6.1
             Overall = 0.0865                                         max =         15
        
                                                        F(2, 4709)        =     507.42
        corr(u_i, Xb) = 0.0440                          Prob > F          =     0.0000
        
                                     (Std. err. adjusted for 4,710 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                 age |   .0539076    .004307    12.52   0.000     .0454638    .0623515
                     |
         c.age#c.age |  -.0005973    .000072    -8.30   0.000    -.0007384   -.0004562
                     |
               _cons |    .639913   .0624195    10.25   0.000     .5175415    .7622845
        -------------+----------------------------------------------------------------
             sigma_u |   .4039153
             sigma_e |  .30245467
                 rho |  .64073314   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . predict fitted, xb
        (24 missing values generated)
        
        . g sq_fitted=fitted^2
        (24 missing values generated)
        
        . xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode )
        
        Fixed-effects (within) regression               Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-squared:                                      Obs per group:
             Within  = 0.1092                                         min =          1
             Between = 0.1033                                         avg =        6.1
             Overall = 0.0881                                         max =         15
        
                                                        F(2, 4709)        =     523.09
        corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000
        
                                     (Std. err. adjusted for 4,710 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
              fitted |   2.569185   .7085064     3.63   0.000     1.180181    3.958189
           sq_fitted |    -.47432   .2153021    -2.20   0.028    -.8964128   -.0522272
               _cons |  -1.290258    .580562    -2.22   0.026    -2.428431   -.1520844
        -------------+----------------------------------------------------------------
             sigma_u |    .403403
             sigma_e |  .30238578
                 rho |  .64025357   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . testparm sq_fitted
        
         ( 1)  sq_fitted = 0
        
               F(  1,  4709) =    4.85
                    Prob > F =    0.0276
        
        .
        The outcome of -testparm- clearly shows that the regression was misspecified.
        Kind regards,
        Carlo
        (StataNow 18.5)


        • #5
          Dear Carlo Lazzaro

          Thank you for your detailed response and guidance.

          One of the reviewers of my paper raised concerns about the Within R² values, questioning whether having values closer to 0 rather than 1 could indicate a problem with the model.

          Following your advice, I conducted the tests using the example you provided, and I obtained the following results:

          (Here, insert a brief summary of key results, particularly highlighting the testparm result and linktest findings.)

          Based on these findings, can I conclude that my model is appropriately specified? If there are any additional considerations I should take into account, I would greatly appreciate your insights.

          Thank you again for your time and valuable support.

          Best regards,



          Model 1: Overall

          Code:
          . xtreg Gbincreaset dum_Grade2 dum_Grade3 PMA interaction_Grade2_PMA interaction_Grade3_PMA dum_Na
          > tionalProject2 BProportiont_1 dum_Congress2 dum_Congress3 dum_Scale2 ln_Period ln_realBUDGETt_1
          > i.YEAR, fe vce(cluster ID)
          
          Fixed-effects (within) regression               Number of obs     =        860
          Group variable: ID                              Number of groups  =         95
          
          R-squared:                                      Obs per group:
               Within  = 0.1701                                         min =          3
               Between = 0.0106                                         avg =        9.1
               Overall = 0.0128                                         max =         11
          
                                                          F(22, 94)         =       6.12
          corr(u_i, Xb) = -0.8797                         Prob > F          =     0.0000
          
                                                        (Std. err. adjusted for 95 clusters in ID)
          ----------------------------------------------------------------------------------------
                                 |               Robust
                     Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -----------------------+----------------------------------------------------------------
                      dum_Grade2 |   .0025008   .0210685     0.12   0.906    -.0393311    .0443328
                      dum_Grade3 |  -.1265205   .0442847    -2.86   0.005    -.2144488   -.0385923
                             PMA |   -.400207   .2124456    -1.88   0.063    -.8220226    .0216087
          interaction_Grade2_PMA |   .0676065   .0641698     1.05   0.295    -.0598041    .1950171
          interaction_Grade3_PMA |  -.2909324    .169415    -1.72   0.089    -.6273098     .045445
            dum_NationalProject2 |   .0208537    .025206     0.83   0.410    -.0291934    .0709008
                  BProportiont_1 |   .1649261   .0595687     2.77   0.007      .046651    .2832012
                   dum_Congress2 |    .021932   .0193291     1.13   0.259    -.0164463    .0603104
                   dum_Congress3 |  -.0256428   .0214164    -1.20   0.234    -.0681656      .01688
                      dum_Scale2 |   .1186718   .0536993     2.21   0.030     .0120506     .225293
                       ln_Period |  -.0096512    .084944    -0.11   0.910    -.1783096    .1590072
                ln_realBUDGETt_1 |  -.1488883   .0399237    -3.73   0.000    -.2281578   -.0696188
                                 |
                            YEAR |
                           2015  |    .064101   .0283541     2.26   0.026     .0078033    .1203987
                           2016  |   .0502911   .0363819     1.38   0.170     -.021946    .1225281
                           2017  |   .0178068   .0331164     0.54   0.592    -.0479466    .0835601
                           2018  |   .0134901   .0392066     0.34   0.732    -.0643555    .0913356
                           2019  |   .0364182   .0485554     0.75   0.455    -.0599896     .132826
                           2020  |    .074119   .0547573     1.35   0.179     -.034603     .182841
                           2021  |   .1087537   .0594412     1.83   0.070    -.0092682    .2267756
                           2022  |   .0338457   .0637374     0.53   0.597    -.0927064    .1603977
                           2023  |   .0092021   .0690746     0.13   0.894    -.1279471    .1463513
                           2024  |   -.079715   .0713125    -1.12   0.266    -.2213076    .0618776
                                 |
                           _cons |   .7804078   .2973636     2.62   0.010     .1899854     1.37083
          -----------------------+----------------------------------------------------------------
                         sigma_u |  .24078143
                         sigma_e |   .1931013
                             rho |  .60858051   (fraction of variance due to u_i)
          ----------------------------------------------------------------------------------------
          
          . estimates store clusterfe
          
          .
          . predict fitted, xb
          (196 missing values generated)
          
          .
          . g sq_fitted=fitted^2
          (196 missing values generated)
          
          .
          . xtreg Gbincreaset fitted sq_fitted , fe vce(cluster ID)
          
          Fixed-effects (within) regression               Number of obs     =        860
          Group variable: ID                              Number of groups  =         95
          
          R-squared:                                      Obs per group:
               Within  = 0.1701                                         min =          3
               Between = 0.0106                                         avg =        9.1
               Overall = 0.0128                                         max =         11
          
                                                          F(2, 94)          =      38.89
          corr(u_i, Xb) = -0.8798                         Prob > F          =     0.0000
          
                                              (Std. err. adjusted for 95 clusters in ID)
          ------------------------------------------------------------------------------
                       |               Robust
           Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                fitted |    .999202   .1149264     8.69   0.000      .771013    1.227391
             sq_fitted |   .0071848   .1870222     0.04   0.969    -.3641521    .3785217
                 _cons |  -.0003585   .0101089    -0.04   0.972    -.0204301     .019713
          -------------+----------------------------------------------------------------
               sigma_u |  .24082712
               sigma_e |  .19055341
                   rho |  .61497978   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          .
          . testparm sq_fitted
          
           ( 1)  sq_fitted = 0
          
                 F(  1,    94) =    0.00
                      Prob > F =    0.9694

          Model 2 (Department: MSIT)

          Code:
          . xtreg Gbincreaset dum_Grade2 dum_Grade3 PMA interaction_Grade2_PMA interaction_Grade3_PMA dum_NationalProject2 BProportiont_1 dum_Congress2 dum_Congress3 dum_Scale2 ln_Period ln_realBUDGETt_1 i.YEAR, fe vce(cluster
          >  ID)
          
          Fixed-effects (within) regression               Number of obs     =        284
          Group variable: ID                              Number of groups  =         29
          
          R-squared:                                      Obs per group:
               Within  = 0.2884                                         min =          6
               Between = 0.1156                                         avg =        9.8
               Overall = 0.0004                                         max =         11
          
                                                          F(22, 28)         =      49.84
          corr(u_i, Xb) = -0.9607                         Prob > F          =     0.0000
          
                                                        (Std. err. adjusted for 29 clusters in ID)
          ----------------------------------------------------------------------------------------
                                 |               Robust
                     Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -----------------------+----------------------------------------------------------------
                      dum_Grade2 |   .0146971   .0341105     0.43   0.670    -.0551751    .0845694
                      dum_Grade3 |  -.0262159   .0949068    -0.28   0.784    -.2206236    .1681918
                             PMA |  -.7700591    .598621    -1.29   0.209    -1.996279    .4561604
          interaction_Grade2_PMA |   .0516906     .08433     0.61   0.545    -.1210515    .2244328
          interaction_Grade3_PMA |   -.050318   .3672456    -0.14   0.892    -.8025866    .7019505
            dum_NationalProject2 |    .042069   .0288556     1.46   0.156    -.0170389    .1011769
                  BProportiont_1 |   .2932974   .1111382     2.64   0.013     .0656411    .5209536
                   dum_Congress2 |   .0003401   .0309072     0.01   0.991    -.0629704    .0636506
                   dum_Congress3 |  -.0729838   .0459666    -1.59   0.124    -.1671421    .0211745
                      dum_Scale2 |   .1869602   .0808099     2.31   0.028     .0214287    .3524917
                       ln_Period |    -.24902   .2212701    -1.13   0.270    -.7022712    .2042311
                ln_realBUDGETt_1 |  -.3046663   .0614343    -4.96   0.000    -.4305088   -.1788237
                                 |
                            YEAR |
                           2015  |   .1142503   .0740903     1.54   0.134    -.0375168    .2660175
                           2016  |   .1994159   .0813761     2.45   0.021     .0327246    .3661073
                           2017  |   .1641194   .1036531     1.58   0.125    -.0482044    .3764432
                           2018  |   .1158698   .1222286     0.95   0.351    -.1345041    .3662437
                           2019  |   .2071422   .1230809     1.68   0.103    -.0449776     .459262
                           2020  |   .2357455   .1617658     1.46   0.156    -.0956167    .5671077
                           2021  |    .307164   .1437909     2.14   0.042     .0126216    .6017063
                           2022  |   .2772618   .1672246     1.66   0.108    -.0652823    .6198059
                           2023  |   .2929786   .1679828     1.74   0.092    -.0511186    .6370759
                           2024  |   .1663899   .1717733     0.97   0.341    -.1854717    .5182515
                                 |
                           _cons |   1.988337   .5953076     3.34   0.002     .7689049     3.20777
          -----------------------+----------------------------------------------------------------
                         sigma_u |  .51847783
                         sigma_e |  .20567156
                             rho |  .86403708   (fraction of variance due to u_i)
          ----------------------------------------------------------------------------------------
          
          . estimates store clusterfe
          
          . 
          . predict fitted, xb
          (35 missing values generated)
          
          . 
          . g sq_fitted=fitted^2
          (35 missing values generated)
          
          . 
          . xtreg Gbincreaset fitted sq_fitted , fe vce(cluster ID)
          
          Fixed-effects (within) regression               Number of obs     =        284
          Group variable: ID                              Number of groups  =         29
          
          R-squared:                                      Obs per group:
               Within  = 0.2904                                         min =          6
               Between = 0.1139                                         avg =        9.8
               Overall = 0.0006                                         max =         11
          
                                                          F(2, 28)          =      42.90
          corr(u_i, Xb) = -0.9593                         Prob > F          =     0.0000
          
                                              (Std. err. adjusted for 29 clusters in ID)
          ------------------------------------------------------------------------------
                       |               Robust
           Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                fitted |   .9712194   .1146311     8.47   0.000     .7364081    1.206031
             sq_fitted |   .0866743   .0898665     0.96   0.343    -.0974088    .2707575
                 _cons |  -.0197115   .0205498    -0.96   0.346     -.061806    .0223829
          -------------+----------------------------------------------------------------
               sigma_u |  .51438012
               sigma_e |  .19709565
                   rho |  .87197629   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          . 
          . testparm sq_fitted
          
           ( 1)  sq_fitted = 0
          
                 F(  1,    28) =    0.93
                      Prob > F =    0.3431
          
          .


          Model 3 (Department: MOTIE)

          Code:
          . xtreg Gbincreaset dum_Grade2 dum_Grade3 PMA interaction_Grade2_PMA interaction_Grade3_PMA dum_NationalProject2 BProportiont_1 dum_Congress2 dum_Congress3 dum_Scale2 ln_Period ln_realBUDGETt_1 i.YEAR, fe vce(cluster
          >  ID)
          
          Fixed-effects (within) regression               Number of obs     =        224
          Group variable: ID                              Number of groups  =         24
          
          R-squared:                                      Obs per group:
               Within  = 0.2858                                         min =          5
               Between = 0.0353                                         avg =        9.3
               Overall = 0.0199                                         max =         11
          
                                                          F(22, 23)         =     143.70
          corr(u_i, Xb) = -0.8989                         Prob > F          =     0.0000
          
                                                        (Std. err. adjusted for 24 clusters in ID)
          ----------------------------------------------------------------------------------------
                                 |               Robust
                     Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -----------------------+----------------------------------------------------------------
                      dum_Grade2 |  -.0040038   .0498117    -0.08   0.937    -.1070472    .0990395
                      dum_Grade3 |  -.2060176   .0228124    -9.03   0.000    -.2532086   -.1588265
                             PMA |  -1.102502    .362825    -3.04   0.006    -1.853062   -.3519409
          interaction_Grade2_PMA |   .2784458   .2270171     1.23   0.232    -.1911748    .7480664
          interaction_Grade3_PMA |  -.2284704   .1077059    -2.12   0.045     -.451277   -.0056638
            dum_NationalProject2 |   .0794018   .0550323     1.44   0.163    -.0344412    .1932447
                  BProportiont_1 |    .018985   .0640657     0.30   0.770    -.1135451    .1515151
                   dum_Congress2 |     .04456   .0345698     1.29   0.210    -.0269531    .1160732
                   dum_Congress3 |  -.0483268   .0290808    -1.66   0.110     -.108485    .0118315
                      dum_Scale2 |   .0149386   .0401202     0.37   0.713    -.0680563    .0979335
                       ln_Period |  -.2137149   .1050459    -2.03   0.054    -.4310189     .003589
                ln_realBUDGETt_1 |  -.0837218   .0510144    -1.64   0.114    -.1892532    .0218096
                                 |
                            YEAR |
                           2015  |   .0881138   .0457769     1.92   0.067    -.0065829    .1828106
                           2016  |  -.0021486   .0463278    -0.05   0.963     -.097985    .0936878
                           2017  |    .032101   .0573949     0.56   0.581    -.0866295    .1508314
                           2018  |   .0556201   .0706875     0.79   0.439    -.0906082    .2018483
                           2019  |   .1412439     .07652     1.85   0.078    -.0170498    .2995376
                           2020  |   .2123569   .0820638     2.59   0.016     .0425951    .3821187
                           2021  |    .173264   .0876465     1.98   0.060    -.0080467    .3545746
                           2022  |   .0832085   .0957713     0.87   0.394    -.1149096    .2813265
                           2023  |    .099512   .1021161     0.97   0.340    -.1117313    .3107553
                           2024  |   .0274431   .1289697     0.21   0.833     -.239351    .2942372
                                 |
                           _cons |   1.725018   .4687378     3.68   0.001     .7553602    2.694676
          -----------------------+----------------------------------------------------------------
                         sigma_u |  .30347756
                         sigma_e |  .18860531
                             rho |  .72137702   (fraction of variance due to u_i)
          ----------------------------------------------------------------------------------------
          
          . estimates store clusterfe
          
          .
          . predict fitted, xb
          (51 missing values generated)
          
          .
          . g sq_fitted=fitted^2
          (51 missing values generated)
          
          .
          . xtreg Gbincreaset fitted sq_fitted , fe vce(cluster ID)
          
          Fixed-effects (within) regression               Number of obs     =        224
          Group variable: ID                              Number of groups  =         24
          
          R-squared:                                      Obs per group:
               Within  = 0.2858                                         min =          5
               Between = 0.0355                                         avg =        9.3
               Overall = 0.0198                                         max =         11
          
                                                          F(2, 23)          =      33.11
          corr(u_i, Xb) = -0.8991                         Prob > F          =     0.0000
          
                                              (Std. err. adjusted for 24 clusters in ID)
          ------------------------------------------------------------------------------
                       |               Robust
           Gbincreaset | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                fitted |   .9991965   .1258851     7.94   0.000     .7387833     1.25961
             sq_fitted |   .0086071   .1315365     0.07   0.948    -.2634969    .2807112
                 _cons |  -.0006925    .010736    -0.06   0.949    -.0229015    .0215165
          -------------+----------------------------------------------------------------
               sigma_u |  .30367183
               sigma_e |  .17882531
                   rho |  .74251402   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          .
          . testparm sq_fitted
          
           ( 1)  sq_fitted = 0
          
                 F(  1,    23) =    0.00
                      Prob > F =    0.9484
          Last edited by Hyunjin Cha; 24 Feb 2025, 08:43.


          • #6
            Hyunjin:
            if I had to choose, I would go with Model 2 (Department: MSIT).
            Kind regards,
            Carlo
            (StataNow 18.5)


            • #7
              Originally posted by Rajdeep Chaudhuri View Post
              Hi Hyunjin,
              A low R² in the fixed-effects context essentially means either that you have some misspecification or that the explanatory variables do not vary much over time in your dataset. While I cannot say how exactly you would expect PMA or the grades to vary, looking at a low within R² alone is not the best way to evaluate your model. Since a reviewer raised the question, I assume the expectation is that your explanatory variables should vary much more over time than they do in your dataset. If you have convinced yourself that there is no misspecification, the answer is that your dataset is constrained on that end.

              You could actually look at the discussion here for a better understanding of your problem: https://www.statalist.org/forums/for...a-fe-re-models
              Dear Rajdeep Chaudhuri,

              Thank you for your response. As you mentioned, I understand that a low within R² is not just a simple issue but could be related to the characteristics of my dataset.

              To further investigate the reviewer's concern about the low R², I ran a manual link test in Stata to assess the specification of my model. Specifically, I re-estimated each model with only the fitted values (fitted) and their squares (sq_fitted) as regressors. The results are as follows:
              1. The coefficient of sq_fitted is not statistically significant in any of the models (p-values: Model 1 = 0.969, Model 2 = 0.343, Model 3 = 0.948).
                • This gives no evidence that the linear predictor misses an important nonlinear pattern.
              2. The coefficient of fitted is close to 1 in all models:
                • Model 1: 0.9992 (p < 0.001)
                • Model 2: 0.9712 (p < 0.001)
                • Model 3: 0.9992 (p < 0.001)
                • This is consistent with a correctly specified linear predictor.
              Based on these results, I believe that the low within R² in my model is likely due to the limited time variation in the explanatory variables rather than a misspecification issue. Therefore, instead of being a problem with the model itself, this appears to be a structural limitation of the dataset that should be considered.
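
              One way to check the limited-time-variation explanation might be to split each regressor's variation into its between-program and within-program parts with xtsum. This is only a sketch: it assumes the estimation data are in memory and that ID and YEAR identify the panel, and the variable list is an illustrative subset of the regressors above.

              Code:
              * Sketch, assuming ID and YEAR are the panel and time identifiers
              xtset ID YEAR
              * Reports overall, between, and within standard deviations
              xtsum PMA dum_Grade2 dum_Grade3 BProportiont_1 ln_realBUDGETt_1

              A within standard deviation that is small relative to the between one would support the argument that the regressors move little over time within programs.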

              Would it be appropriate to explicitly state this as a limitation of my study? Additionally, do you have any suggestions for further checks I could perform to strengthen my argument?

              Thank you again for your insights. I appreciate your time and guidance!

              Best regards,
              Hyunjin


              • #8
                Originally posted by Carlo Lazzaro View Post
                Hyunjin:
                if I had to choose, I would go with Model 2 (Department: MSIT).
                I really appreciate your insights, Carlo Lazzaro. Could you elaborate on why you find Model 2 to be the most appropriate? Is it mainly due to the higher within R², or do you see other strengths in this model compared to the others?


                • #9
                  Originally posted by George Ford View Post
                  A low within R² in fixed effects regression isn’t necessarily a problem. The within R² only measures variation after removing the fixed effects, so it naturally tends to be lower than regular R² values (and it is not that low really). The fixed effects already account for all time-invariant characteristics, which often explain a large portion of the total variation. The within R² ignores this explained variation. I’d focus on whether your coefficients are statistically significant, their magnitudes/sign are sensible, whether the model is theoretically legit, and if the F is statistically significant. I don’t usually report R2 for 2WFE models.
                  I appreciate your insight, George Ford. This explanation helps me better address the reviewer's concern. If within R² is not a critical measure for FE models, would it be reasonable to emphasize the significance of the coefficients and model specification rather than reporting R² in my response to the reviewer?


                  • #10
                    Here's a little sketch of the issue.

                    y is determined by x and w, but the effect of w is weaker.

                    The R² is lowered by inflating the disturbance (e2 = 5*e1).

                    When R² is lower, the standard errors of the coefficients rise, but the coefficient estimates remain correct.

                    So when a coefficient is statistically significant and R² is low, that's pretty strong evidence. Insignificance, not so much (you can see the t for w falls from 10 to 2).

                    Code:
                    clear
                    
                    set seed 123456
                    
                    postfile sims bx1 sx1 bw1 sw1 r21 bx2 sx2 bw2 sw2 r22 using simul, replace
                    
                    forv i = 1/1000    {
                        quietly {
                        drop _all
                    set obs 10000
                    g x = rgamma(5,1)
                    g w = rnormal()
                    
                    g e1 = rnormal()
                    * center is a community-contributed command (ssc install center)
                    center e1 , replace
                    g e2 = e1*5
                    
                    g y1 = 1 + 0.5*x + 0.1*w + e1
                    
                    g y2 = 1 + 0.5*x + 0.1*w + e2
                    
                    reg y1 x w
                    local bx1 = _b[x]
                    local sx1 = _se[x]
                    local bw1 = _b[w]
                    local sw1 = _se[w]
                    
                    local r21 = e(r2)
                    
                    reg y2 x w
                    local bx2 = _b[x]
                    local sx2 = _se[x]
                    local bw2 = _b[w]
                    local sw2 = _se[w]
                    local r22 = e(r2)
                            
                    post sims  (`bx1') (`sx1') (`bw1') (`sw1') (`r21') (`bx2') (`sx2') (`bw2') (`sw2') (`r22')
                    }
                    }
                    postclose sims
                    use simul, clear
                    collapse (mean) *
                    capture program drop results
                    program results
                    di _col(20) "High R2" _col(30) "Low R2"
                    di "R-sq" _col(20) %5.3f r21[1] _col(30) %5.3f r22[1]
                    di "b_x [0.5]" _col(20) %5.3f bx1[1] _col(30) %5.3f bx2[1]
                    di "se_x"  _col(20) %5.3f sx1[1] _col(30) %5.3f sx2[1]
                    di "b_w [0.1]" _col(20) %5.3f bw1[1] _col(30) %5.3f bw2[1]
                    di "se_w"  _col(20) %5.3f sw1[1] _col(30) %5.3f sw2[1]
                    end
                    results
                    
                    
                                       High R2   Low R2
                    R-sq               0.558     0.048

                    b_x [0.5]          0.500     0.500
                    se_x               0.004     0.022

                    b_w [0.1]          0.100     0.099
                    se_w               0.010     0.050
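
                    A panel version of the same idea, as a rough sketch with made-up parameters (100 units, 10 periods, a unit effect with standard deviation 3; none of this comes from the actual data). The within R² from xtreg, fe leaves out the variation absorbed by the fixed effects, while areg with absorb() counts it as explained, so the two R² values can differ sharply even though the slope on x is the same. In this design the population within R² is 0.25/1.25 = 0.2.

                    Code:
                    clear
                    set seed 2025
                    * Hypothetical panel: 100 units observed for 10 periods
                    set obs 100
                    g id = _n
                    * Unit fixed effect, large relative to the idiosyncratic noise
                    g u = 3*rnormal()
                    expand 10
                    bysort id: g t = _n
                    * Regressor correlated with the fixed effect
                    g x = rnormal() + 0.5*u
                    g y = 1 + 0.5*x + u + rnormal()
                    xtset id t
                    * Within R2: variation explained after removing the unit means
                    xtreg y x, fe
                    di "within R2 = " %5.3f e(r2_w)
                    * Same slope, but R2 now counts the fixed effects as explained
                    areg y x, absorb(id)
                    di "LSDV R2   = " %5.3f e(r2)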
                    Last edited by George Ford; 24 Feb 2025, 10:12.


                    • #11
                      Hyunjin:
                      it is mainly due to the higher within R².
                      Kind regards,
                      Carlo
                      (StataNow 18.5)


                      • #12
                        in my experience, a within R2 of 0.3 is pretty darn good.


                        • #13
                          Dear Hyunjin,
                          It is a reasonable argument if you want to defend a "low within R²", but whether it is appropriate depends on the scope of your study and the dataset. As others have already pointed out, whether the coefficients are significant, whether their signs make theoretical sense, and your overall F are more potent indicators of the viability of your model. If you can explain these, your within R² should not be a problem. I am more concerned about why your reviewer chose to point out the R², but that's just a personal ick.
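
                          On the significance point, a joint test of the grade dummies and their PMA interactions may be more informative than the individual t-statistics. A sketch, reusing the variable names from the models above and assuming the relevant estimation sample is in memory:

                          Code:
                          * Re-fit the FE model quietly, then test the grade terms jointly
                          quietly xtreg Gbincreaset dum_Grade2 dum_Grade3 PMA interaction_Grade2_PMA interaction_Grade3_PMA dum_NationalProject2 BProportiont_1 dum_Congress2 dum_Congress3 dum_Scale2 ln_Period ln_realBUDGETt_1 i.YEAR, fe vce(cluster ID)
                          * H0: all four coefficients are zero (cluster-robust F test)
                          testparm dum_Grade2 dum_Grade3 interaction_Grade2_PMA interaction_Grade3_PMA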
