  • Fixed-Effects Regression with or without Year-Dummies?

    Hey Statas,
    I have trouble understanding why you should use year dummies in a fixed-effects regression when I have already declared the time variable in xtset.
    When I use xtset, I declare company (Unt_id) as the panel variable and year as the time variable. Since the time variable is already declared there, why should I include year dummies in my regression again?

    My results from the regression with year dummies differ from the results I get from the regression without year dummies.
    Should I use the one with the higher R^2?

    Thanks in advance!

    Code:
    xtreg ESGScore AR V FQ_V FQ_AR ANQ IndependentMembers VG CSRCommittee logTotalRevenue, fe vce(cluster Unt_id)
    
    Fixed-effects (within) regression               Number of obs     =        518
    Group variable: Unt_id                          Number of groups  =        128
    
    R-squared:                                      Obs per group:
         Within  = 0.3501                                         min =          1
         Between = 0.5492                                         avg =        4.0
         Overall = 0.5204                                         max =          5
    
                                                    F(9,127)          =      14.56
    corr(u_i, Xb) = -0.4480                         Prob > F          =     0.0000
    
                                         (Std. err. adjusted for 128 clusters in Unt_id)
    ------------------------------------------------------------------------------------
                       |               Robust
              ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------------+----------------------------------------------------------------
                    AR |   .4419116   .3132781     1.41   0.161    -.1780093    1.061832
                     V |   .5223736   .4697566     1.11   0.268      -.40719    1.451937
                  FQ_V |   4.000499   3.999181     1.00   0.319    -3.913158    11.91416
                 FQ_AR |    23.9788   5.367196     4.47   0.000     13.35809    34.59952
                   ANQ |   7.998329   9.642444     0.83   0.408    -11.08233    27.07899
    IndependentMembers |   .1809809   .0402595     4.50   0.000     .1013146    .2606473
                    VG |   .2437537   .2735775     0.89   0.375    -.2976069    .7851142
          CSRCommittee |   8.454832   1.919122     4.41   0.000     4.657236    12.25243
       logTotalRevenue |   16.13881   4.720446     3.42   0.001     6.797899    25.47972
                 _cons |  -129.1692   44.74514    -2.89   0.005    -217.7118   -40.62666
    -------------------+----------------------------------------------------------------
               sigma_u |  14.130315
               sigma_e |   6.127202
                   rho |  .84173156   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------------
    
    . 
    . 
    . xtreg ESGScore AR V FQ_V FQ_AR ANQ IndependentMembers VG CSRCommittee logTotalRevenue i.year, fe vce(cluster Unt_id)
    
    Fixed-effects (within) regression               Number of obs     =        518
    Group variable: Unt_id                          Number of groups  =        128
    
    R-squared:                                      Obs per group:
         Within  = 0.5170                                         min =          1
         Between = 0.4853                                         avg =        4.0
         Overall = 0.4958                                         max =          5
    
                                                    F(13,127)         =      28.34
    corr(u_i, Xb) = 0.0412                          Prob > F          =     0.0000
    
                                         (Std. err. adjusted for 128 clusters in Unt_id)
    ------------------------------------------------------------------------------------
                       |               Robust
              ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------------+----------------------------------------------------------------
                    AR |   .2745461   .2723805     1.01   0.315    -.2644458     .813538
                     V |   .4833979    .355234     1.36   0.176     -.219546    1.186342
                  FQ_V |  -1.281977    3.38248    -0.38   0.705    -7.975294     5.41134
                 FQ_AR |   .9844655   5.182062     0.19   0.850    -9.269899    11.23883
                   ANQ |   13.28432   4.669543     2.84   0.005     4.044135     22.5245
    IndependentMembers |   .1150257    .032074     3.59   0.000      .051557    .1784944
                    VG |  -.6194623   .4177155    -1.48   0.141    -1.446046    .2071214
          CSRCommittee |   5.000126    1.67506     2.99   0.003     1.685485    8.314767
       logTotalRevenue |    10.1636   3.853459     2.64   0.009     2.538296     17.7889
                       |
                  year |
                 2017  |   2.130767    .819544     2.60   0.010     .5090374    3.752497
                 2018  |     4.4858   1.104202     4.06   0.000     2.300782    6.670817
                 2019  |   6.940105   1.266919     5.48   0.000       4.4331     9.44711
                 2020  |   10.78061   1.515915     7.11   0.000      7.78089    13.78033
                       |
                 _cons |  -62.66249   36.08125    -1.74   0.085    -134.0608    8.735795
    -------------------+----------------------------------------------------------------
               sigma_u |  13.417675
               sigma_e |  5.3102312
                   rho |  .86458142   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------------

  • #2
    If you add year indicators ("dummies") to your regression after -xtset panelvar timevar- you are not adding them "again." While -xtset- automatically causes subsequent -xt- estimation commands to include panel indicators (or condition on panel, for non-linear models), it does not cause them to include time variables. So you can include or exclude time variables as you see fit: but if you want them included, you have to do it explicitly in your estimation command, -xtset- does not make Stata do it for you.
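
To make the distinction concrete, here is a minimal sketch on Stata's shipped nlswork panel. The dataset and the regressors tenure and hours are assumptions chosen for illustration; they are not the poster's data. -xtset- only records which variables identify the panel and the time, so the first -xtreg, fe- contains no year effects, and the second adds them explicitly through i.year.

```stata
* Illustration only: nlswork stands in for the poster's firm panel
webuse nlswork, clear
xtset idcode year              // declares panel and time variables, nothing more

* Panel fixed effects only: no year indicators are included
xtreg ln_wage tenure hours, fe vce(cluster idcode)
estimates store fe_noyear

* Same model with year indicators added explicitly
xtreg ln_wage tenure hours i.year, fe vce(cluster idcode)
estimates store fe_year

* Side-by-side comparison of the slope coefficients
estimates table fe_noyear fe_year, keep(tenure hours) b(%9.4f) se(%9.4f)
```

If the two columns differ noticeably, that suggests the year indicators are picking up common shocks that the first model attributes to the regressors.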

    Should I use the one with the higher R^2?
    No. First of all, with -xtreg- you get 3 different R2 statistics, and they disagree about which model gets the higher one. But even if this were just ordinary linear regression without panel structure, the model with more variables will always get the higher R2 (unless the additional variables have missing values and cause the sample size to shrink). No, you should choose based upon whether or not it is reasonable to expect yearly shocks in the outcome variable. If so, you should include year indicators to remove that extraneous source of variance from the outcome. But if there is nothing about the outcome that changes from year to year, then there is no need to include them. It's a substantive question, really. If you are unsure about it in your case, consult somebody in your field.
    Last edited by Clyde Schechter; 15 Feb 2022, 15:42.



    • #3
      Thanks Clyde, I have a further question. What do you mean by "if there is nothing about the outcome that changes from year to year"? The outcome variable, the ESG score, seems to rise from year to year, and I want to know whether that has something to do with the overall board composition of German companies (AR = Size of Supervisory Board, V = Size of Management Board, FQ_AR = Percentage of Women in Supervisory Board, etc.). The literature seems clear that women have a significant positive impact on the ESG score, which makes me wonder whether my regression is "wrong", because women have no significant impact there when I regress with fixed effects and year dummies.

      If I run the regression separately for each year, I get the result that women have a positive impact in every year (see code).
      I want to avoid a result in which none of the CG variables I have chosen has a significant influence, since that is not really what the current literature says.

      Thanks in advance

      Code:
      . keep if year == 2020
      (394 observations deleted)
      
      . reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)
      
      Linear regression                               Number of obs     =        131
                                                      F(7, 130)         =      35.44
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.5622
                                                      Root MSE          =     12.765
      
                                        (Std. err. adjusted for 131 clusters in Unt_id)
      ---------------------------------------------------------------------------------
                      |               Robust
             ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ----------------+----------------------------------------------------------------
                    V |   .6480332   .7697278     0.84   0.401    -.8747811    2.170848
                   AR |  -.3449393   .4207435    -0.82   0.414     -1.17733    .4874515
                 FQ_V |   10.34147   7.098629     1.46   0.148    -3.702317    24.38526
                FQ_AR |   16.65881    9.45496     1.76   0.080      -2.0467    35.36431
                  ANQ |   8.983921   8.290711     1.08   0.281    -7.418259     25.3861
      logTotalRevenue |    11.1041   2.493682     4.45   0.000     6.170646    16.03755
         CSRCommittee |   17.76665   3.203929     5.55   0.000     11.42806    24.10524
                _cons |  -65.81481   19.51878    -3.37   0.001    -104.4304   -27.19923
      ---------------------------------------------------------------------------------
      
      . 
      . clear
      
      . import data
      . keep if year == 2019
      (402 observations deleted)
      
      . reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)
      
      Linear regression                               Number of obs     =        123
                                                      F(7, 122)         =      28.09
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.5540
                                                      Root MSE          =     13.426
      
                                        (Std. err. adjusted for 123 clusters in Unt_id)
      ---------------------------------------------------------------------------------
                      |               Robust
             ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ----------------+----------------------------------------------------------------
                    V |   1.090006   .7970792     1.37   0.174    -.4878914    2.667904
                   AR |   .2255095    .438015     0.51   0.608     -.641585    1.092604
                 FQ_V |    15.3489    7.48377     2.05   0.042     .5340269    30.16377
                FQ_AR |   42.77137   11.47711     3.73   0.000     20.05128    65.49146
                  ANQ |  -9.875031   9.368227    -1.05   0.294    -28.42037    8.670311
      logTotalRevenue |    9.57542   2.536419     3.78   0.000     4.554325    14.59652
         CSRCommittee |   12.81309   3.394163     3.78   0.000     6.094005    19.53218
                _cons |  -59.88676   19.88828    -3.01   0.003     -99.2576   -20.51592
      ---------------------------------------------------------------------------------
      
      . 
      . clear
      
      . import data
      
      . keep if year == 2018
      (409 observations deleted)
      
      . reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)
      
      Linear regression                               Number of obs     =        116
                                                      F(7, 115)         =      31.73
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.6017
                                                      Root MSE          =     13.186
      
                                        (Std. err. adjusted for 116 clusters in Unt_id)
      ---------------------------------------------------------------------------------
                      |               Robust
             ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ----------------+----------------------------------------------------------------
                    V |   1.358497   .8869254     1.53   0.128    -.3983315    3.115326
                   AR |   .0414441   .4269679     0.10   0.923    -.8042971    .8871852
                 FQ_V |   18.76238   8.804256     2.13   0.035     1.322839    36.20191
                FQ_AR |   56.32548   11.41053     4.94   0.000     33.72343    78.92754
                  ANQ |  -12.62858   8.778121    -1.44   0.153    -30.01635    4.759193
      logTotalRevenue |   9.077139   2.650677     3.42   0.001     3.826659    14.32762
         CSRCommittee |   15.99752   3.187191     5.02   0.000      9.68431    22.31073
                _cons |  -60.94069   20.71223    -2.94   0.004    -101.9676   -19.91374
      ---------------------------------------------------------------------------------
      
      . 
      . clear
      
      . import data
      
      . keep if year == 2017
      (439 observations deleted)
      
      . reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)
      
      Linear regression                               Number of obs     =         86
                                                      F(7, 85)          =      10.49
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.4777
                                                      Root MSE          =     14.515
      
                                         (Std. err. adjusted for 86 clusters in Unt_id)
      ---------------------------------------------------------------------------------
                      |               Robust
             ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ----------------+----------------------------------------------------------------
                    V |   .0185184   1.028664     0.02   0.986    -2.026742    2.063779
                   AR |  -.0647979    .479464    -0.14   0.893    -1.018101     .888505
                 FQ_V |   37.95327   15.53597     2.44   0.017     7.063605    68.84293
                FQ_AR |    31.9923   17.65253     1.81   0.073    -3.105653    67.09025
                  ANQ |   4.773122    10.8119     0.44   0.660    -16.72384    26.27008
      logTotalRevenue |   9.343637   3.617755     2.58   0.012     2.150571     16.5367
         CSRCommittee |   10.65174   4.449292     2.39   0.019     1.805355    19.49812
                _cons |   -51.0281   28.73985    -1.78   0.079    -108.1706     6.11443
      ---------------------------------------------------------------------------------
      
      . 
      . clear
      
      . import data
      
      . keep if year == 2016
      (456 observations deleted)
      
      . reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)
      
      Linear regression                               Number of obs     =         69
                                                      F(7, 68)          =      14.21
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.5723
                                                      Root MSE          =      13.93
      
                                         (Std. err. adjusted for 69 clusters in Unt_id)
      ---------------------------------------------------------------------------------
                      |               Robust
             ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ----------------+----------------------------------------------------------------
                    V |   2.123608   1.126406     1.89   0.064    -.1241004    4.371317
                   AR |  -.3180608    .585401    -0.54   0.589     -1.48621    .8500887
                 FQ_V |   50.44724   13.03643     3.87   0.000     24.43344    76.46104
                FQ_AR |   17.95735    17.4664     1.03   0.308     -16.8963    52.81101
                  ANQ |  -5.034274   13.30385    -0.38   0.706     -31.5817    21.51315
      logTotalRevenue |   6.882246   4.296216     1.60   0.114     -1.69072    15.45521
         CSRCommittee |   18.86971   5.100967     3.70   0.000     8.690888    29.04853
                _cons |  -30.97272     34.732    -0.89   0.376    -100.2794    38.33391
      ---------------------------------------------------------------------------------
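
The log above reloads the data and drops observations for every year. A shorter route is to restrict each regression with an if qualifier so the full dataset stays in memory. This is only a sketch on Stata's shipped nlswork data as a stand-in; the years 70, 75, and 80 and the regressors are illustrative assumptions.

```stata
* Illustration only: nlswork stands in for the poster's data
webuse nlswork, clear

* One cross-sectional regression per year; -if- restricts the sample
* without deleting any observations from memory
foreach y of numlist 70 75 80 {
    display as text _newline "Year = `y'"
    regress ln_wage tenure hours if year == `y', vce(cluster idcode)
}
```

With the poster's data the same loop would run over 2016 to 2020 with the ESGScore regressors, and no -clear- or re-import would be needed between years.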



      • #4
        Marian:
        the within R-squared from your second -xtreg,fe- is remarkably higher than the one from your first.
        This difference makes me think that including -i.year- is the way to go.
        In addition, your -i.year- coefficients are clearly statistically significant, and if you test their joint statistical significance via -testparm-, this result will be confirmed.
        In addition, the panel-wise effect and the vector of regressors seem to be poorly correlated (corr(u_i, Xb) = 0.0412). This might be a warning signal to explore -xtreg,re-.
        I would also test whether the functional form of the regressand is correctly specified.
        Finally, I would not bother with annual regressions, as they do not take the panel structure of your data into account.
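
The joint test of the year indicators mentioned above could look like the following sketch. It runs on Stata's shipped nlswork panel rather than the poster's data; the dataset and the regressors tenure and hours are assumptions for illustration.

```stata
* Illustration only: nlswork stands in for the poster's firm panel
webuse nlswork, clear
xtset idcode year

* Fixed-effects model with explicit year indicators
xtreg ln_wage tenure hours i.year, fe vce(cluster idcode)

* H0: all year coefficients are jointly zero
testparm i.year
display "p-value of the joint test: " %6.4f r(p)
```

A small p-value is evidence against dropping the year indicators, although the decision to include them remains a substantive one.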
        Kind regards,
        Carlo
        (StataNow 18.5)



        • #5
          Hello Carlo, thanks for the reply.
          I tested whether I should use the random-effects or the fixed-effects model via the Hausman test, and the result was to go with fe.
          When I run the regression with re, corr(u_i, X) is reported as 0:

          Code:
          xtreg ESGScore_neu V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)
          
          Random-effects GLS regression                   Number of obs     =        520
          Group variable: Unt_id                          Number of groups  =        205
          
          R-squared:                                      Obs per group:
               Within  = 0.4858                                         min =          1
               Between = 0.5504                                         avg =        2.5
               Overall = 0.5518                                         max =          5
          
                                                          Wald chi2(12)     =     607.60
          corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
          
                                               (Std. err. adjusted for 205 clusters in Unt_id)
          ------------------------------------------------------------------------------------
                             |               Robust
                ESGScore_neu | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          -------------------+----------------------------------------------------------------
                           V |   .6841231   .3508614     1.95   0.051    -.0035526    1.371799
                          AR |   .0696171   .2942224     0.24   0.813    -.5070482    .6462825
                        FQ_V |   .9653356   4.660118     0.21   0.836    -8.168328      10.099
                       FQ_AR |   25.80802   8.125275     3.18   0.001     9.882771    41.73326
                         ANQ |   7.529397    5.35747     1.41   0.160    -2.971052    18.02985
             logTotalRevenue |   12.37862   2.070293     5.98   0.000     8.320923    16.43632
                CSRCommittee |    7.84636   1.949793     4.02   0.000     4.024836    11.66788
          IndependentMembers |   .1495845    .022799     6.56   0.000     .1048993    .1942698
                             |
                        year |
                       2017  |   1.929183   1.075316     1.79   0.073    -.1783972    4.036763
                       2018  |   3.837856   1.401931     2.74   0.006     1.090121     6.58559
                       2019  |   6.423435   1.582039     4.06   0.000     3.322696    9.524175
                       2020  |   9.929435   1.753593     5.66   0.000     6.492455    13.36641
                             |
                       _cons |  -86.05129   16.61822    -5.18   0.000    -118.6224   -53.48017
          -------------------+----------------------------------------------------------------
                     sigma_u |  12.484874
                     sigma_e |  5.8458886
                         rho |  .82017866   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------------

          How can you test the functional form of the regressand to be correctly specified?
          Thanks so far!
          Last edited by Marian Dudeck; 16 Feb 2022, 02:48.



          • #6
            Marian:
            1) if you imposed non-default standard errors, you cannot use -hausman- to compare the -fe- vs. -re- specification; you should switch to the community-contributed module -xtoverid- instead. In addition, testing via -hausman- with default standard errors and then imposing non-default ones after the -hausman- outcome is not correct;
            2) as we know from any decent textbook on panel data econometrics, -re- imposes a further constraint vs. -fe-, namely that the correlation between the panel-wise effect and the vector of regressors is assumed to be 0 (unfortunately, this assumption may be far from the truth). The -xtoverid- helpfile covers this issue too.
            3) the test for misspecification of the functional form of the regressand can be regarded more broadly as a test for model misspecification. It is based on the very same procedure detailed under the -linktest- entry in the Stata .pdf manual; unfortunately, -linktest- itself cannot be invoked after -xtreg-.
            In the following toy example the model is (deliberately) misspecified, as you can see from the statistical significance of the -sq_fitted- term:
            Code:
            . use "https://www.stata-press.com/data/r17/nlswork.dta"
            (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
            
            . xtreg ln_wage c.age##c.age, re vce(cluster idcode)
            
            Random-effects GLS regression                   Number of obs     =     28,510
            Group variable: idcode                          Number of groups  =      4,710
            
            R-squared:                                      Obs per group:
                 Within  = 0.1087                                         min =          1
                 Between = 0.1015                                         avg =        6.1
                 Overall = 0.0870                                         max =         15
            
                                                            Wald chi2(2)      =    1258.33
            corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
            
                                         (Std. err. adjusted for 4,710 clusters in idcode)
            ------------------------------------------------------------------------------
                         |               Robust
                 ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     age |   .0590339   .0041049    14.38   0.000     .0509884    .0670795
                         |
             c.age#c.age |  -.0006758   .0000688    -9.83   0.000    -.0008107    -.000541
                         |
                   _cons |   .5479714   .0587198     9.33   0.000     .4328826    .6630601
            -------------+----------------------------------------------------------------
                 sigma_u |   .3654049
                 sigma_e |  .30245467
                     rho |  .59342665   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            
            . xttest0
            
            Breusch and Pagan Lagrangian multiplier test for random effects
            
                    ln_wage[idcode,t] = Xb + u[idcode] + e[idcode,t]
            
                    Estimated results:
                                     |       Var     SD = sqrt(Var)
                            ---------+-----------------------------
                             ln_wage |   .2285836       .4781042
                                   e |   .0914788       .3024547
                                   u |   .1335207       .3654049
            
                    Test: Var(u) = 0
                                         chibar2(01) = 28074.51
                                      Prob > chibar2 =   0.0000
            
            . predict fitted, xb
            (24 missing values generated)
            
            . g sq_fitted=fitted^2
            (24 missing values generated)
            
            . xtreg ln_wage fitted sq_fitted , re vce(cluster idcode)
            
            Random-effects GLS regression                   Number of obs     =     28,510
            Group variable: idcode                          Number of groups  =      4,710
            
            R-squared:                                      Obs per group:
                 Within  = 0.1088                                         min =          1
                 Between = 0.1045                                         avg =        6.1
                 Overall = 0.0887                                         max =         15
            
                                                            Wald chi2(2)      =    1316.74
            corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
            
                                         (Std. err. adjusted for 4,710 clusters in idcode)
            ------------------------------------------------------------------------------
                         |               Robust
                 ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                  fitted |   2.805959   .6246598     4.49   0.000     1.581648    4.030269
               sq_fitted |  -.5516341   .1920793    -2.87   0.004    -.9281026   -.1751656
                   _cons |  -1.468083   .5055433    -2.90   0.004     -2.45893   -.4772365
            -------------+----------------------------------------------------------------
                 sigma_u |  .36481589
                 sigma_e |  .30242516
                     rho |  .59269507   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            
            . test sq_fitted
            
             ( 1)  sq_fitted = 0
            
                       chi2(  1) =    8.25
                     Prob > chi2 =    0.0041
            
            .
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #7
              Hey Carlo, thanks for the reply.
              I've taken your advice and run the -xtoverid- test to see whether I should use re or fe with non-default standard errors.
              The result is still to go with fe, am I correct?

              I understand that I can test for model misspecification, but since I'm using the same model as in the literature, just with a different database, I'm wondering if this would be the way to go.

              Code:
               xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers g1 g2 g3 g4, re vce(cluster Unt_id)
              
              Random-effects GLS regression                   Number of obs     =        520
              Group variable: Unt_id                          Number of groups  =        205
              
              R-squared:                                      Obs per group:
                   Within  = 0.4838                                         min =          1
                   Between = 0.5376                                         avg =        2.5
                   Overall = 0.5415                                         max =          5
              
                                                              Wald chi2(12)     =     569.42
              corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
              
                                                   (Std. err. adjusted for 205 clusters in Unt_id)
              ------------------------------------------------------------------------------------
                                 |               Robust
                        ESGScore | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
              -------------------+----------------------------------------------------------------
                               V |   .6271069   .3193894     1.96   0.050     .0011151    1.253099
                              AR |   .0321355   .2686603     0.12   0.905    -.4944291       .5587
                            FQ_V |   .0107116   .0424773     0.25   0.801    -.0725423    .0939655
                           FQ_AR |   .1903748   .0736692     2.58   0.010     .0459858    .3347637
                             ANQ |   .0645767   .0487449     1.32   0.185    -.0309616    .1601151
                 logTotalRevenue |   11.31811   1.886636     6.00   0.000     7.620371    15.01585
                    CSRCommittee |    7.17636   1.776066     4.04   0.000     3.695335    10.65739
              IndependentMembers |   .1371363   .0203896     6.73   0.000     .0971735    .1770992
                              g1 |  -9.129215   1.598175    -5.71   0.000    -12.26158   -5.996849
                              g2 |  -7.395159   1.227468    -6.02   0.000    -9.800952   -4.989366
                              g3 |  -5.646207   .9294309    -6.07   0.000    -7.467858   -3.824556
                              g4 |  -3.242748   .6728079    -4.82   0.000    -4.561427   -1.924069
                           _cons |  -67.84168   15.00505    -4.52   0.000    -97.25105   -38.43231
              -------------------+----------------------------------------------------------------
                         sigma_u |  11.331823
                         sigma_e |  5.3911309
                             rho |  .81543493   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------------
              
              . 
              . 
              . xtoverid
              
              Test of overidentifying restrictions: fixed vs random effects
              Cross-section time-series model: xtreg re  robust cluster(Unt_id)
              Sargan-Hansen statistic  35.968  Chi-sq(12)   P-value = 0.0003



              • #8
                Marian:
                yes, as -xtoverid- rejects the null, you should go -fe-.
                However, does the -xtoverid- recommendation remain the same if you add -i.year- to the set of predictors?
                As far as the misspecification test is concerned, unless the original model was already tested for it, why not run it (at least to complete the postestimation routine)?
                Kind regards,
                Carlo
                (StataNow 18.5)



                • #9
                  When I add i.year I get an error message from -xtoverid-, which is why I used g1,...,g4 as year dummies and left out the year dummy that was omitted:

                  Code:
                  xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)
                  
                  Random-effects GLS regression                   Number of obs     =        520
                  Group variable: Unt_id                          Number of groups  =        205
                  
                  R-squared:                                      Obs per group:
                       Within  = 0.4838                                         min =          1
                       Between = 0.5376                                         avg =        2.5
                       Overall = 0.5415                                         max =          5
                  
                                                                  Wald chi2(12)     =     569.42
                  corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
                  
                                                       (Std. err. adjusted for 205 clusters in Unt_id)
                  ------------------------------------------------------------------------------------
                                     |               Robust
                            ESGScore | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                  -------------------+----------------------------------------------------------------
                                   V |   .6271069   .3193894     1.96   0.050     .0011151    1.253099
                                  AR |   .0321355   .2686603     0.12   0.905    -.4944291       .5587
                                FQ_V |   .0107116   .0424773     0.25   0.801    -.0725423    .0939655
                               FQ_AR |   .1903748   .0736692     2.58   0.010     .0459858    .3347637
                                 ANQ |   .0645767   .0487449     1.32   0.185    -.0309616    .1601151
                     logTotalRevenue |   11.31811   1.886636     6.00   0.000     7.620371    15.01585
                        CSRCommittee |    7.17636   1.776066     4.04   0.000     3.695335    10.65739
                  IndependentMembers |   .1371363   .0203896     6.73   0.000     .0971735    .1770992
                                     |
                                year |
                               2017  |   1.734056   .9843553     1.76   0.078    -.1952448    3.663357
                               2018  |   3.483008   1.277171     2.73   0.006     .9797983    5.986217
                               2019  |   5.886467   1.444741     4.07   0.000     3.054827    8.718106
                               2020  |   9.129215   1.598175     5.71   0.000     5.996849    12.26158
                                     |
                               _cons |  -76.97089   15.17976    -5.07   0.000    -106.7227   -47.21912
                  -------------------+----------------------------------------------------------------
                             sigma_u |  11.331823
                             sigma_e |  5.3911309
                                 rho |  .81543493   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------------
                  
                  .
                  .
                  . xtoverid
                  2016b:  operator invalid
                  r(198);



                  • #10
                    Marian:
                    being a bit old-fashioned, the glorious -xtoverid- does not support -fvvarlist- notation (my bad, I did not warn you about that in my previous post).
                    Try the usual fix:
                    Code:
                    xi: xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)
                    Kind regards,
                    Carlo
                    (StataNow 18.5)



                    • #11
                      Thanks Carlo,
                      unfortunately the p-value remains 0.0003, so I have to use -fe-.



                      • #12
                        Marian:
                        why unfortunately?
                        Usually it's -re- that is considered the last resort.
                        Kind regards,
                        Carlo
                        (StataNow 18.5)



                        • #13
                          Because with -re- I get better results. It's just frustrating to have no significant variables in a regression that I include in my master's thesis.
                          I researched fixed effects a bit and came across a book which argues that fixed effects aren't appropriate if within variation is low, and that one should rather go with pooled OLS and clustered standard errors.
                          When I look at my data I can see that for every (non-control) variable the within standard deviation is lower than the between standard deviation, in most cases by a wide margin.
                          Another alternative would be to lag the variables by one period, since I want to know whether women on the management board have an impact on ESG performance. They can't make a difference if they joined in the same year as the ESG score was measured.

                          Code:
                          xtsum V AR FQ_AR FQ_V ANQ
                          
                          Variable         |      Mean   Std. dev.       Min        Max |    Observations
                          -----------------+--------------------------------------------+----------------
                          V        overall |  4.367619   1.753976          2          9 |     N =     525
                                   between |             1.577374          2        8.4 |     n =     132
                                   within  |              .693527   -.032381   7.617619 | T-bar = 3.97727
                                           |                                            |
                          AR       overall |     11.56   5.484426          3         21 |     N =     525
                                   between |             5.444044          3         21 |     n =     132
                                   within  |             .5963204       7.16      15.56 | T-bar = 3.97727
                                           |                                            |
                          FQ_AR    overall |  28.92907   12.39945          0   66.66667 |     N =     525
                                   between |             12.33955          0   61.11111 |     n =     132
                                   within  |             5.216747    3.92907   55.59574 | T-bar = 3.97727
                                           |                                            |
                          FQ_V     overall |  9.811565   14.11851          0   66.66667 |     N =     525
                                   between |             11.90917          0         50 |     n =     132
                                   within  |             9.142117  -27.68844    63.1449 | T-bar = 3.97727
                                           |                                            |
                          ANQ      overall |  32.41492   23.03148          0   71.42857 |     N =     525
                                   between |             23.76007          0   54.28571 |     n =     132
                                   within  |             2.489571   15.74826   69.91492 | T-bar = 3.97727
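
                          A minimal sketch of the lagged specification mentioned above, assuming the data are -xtset Unt_id year- so that the time-series operator L. picks up the previous year's value within each company. Which variables to lag is an assumption here (only the board variables), and the first observed year of each company drops out of the estimation sample:

                          Code:
                          * declare the panel structure so that L. knows what "previous period" means
                          xtset Unt_id year
                          * board variables enter with a one-year lag; controls and year dummies stay contemporaneous
                          xtreg ESGScore L.(V AR FQ_V FQ_AR ANQ) logTotalRevenue CSRCommittee IndependentMembers i.year, fe vce(cluster Unt_id)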



                          • #14
                            Another alternative could be to include fixed effects for the industry instead of for the companies. The results are the following:

                            Code:
                            reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year i.IndustryDummy, vce(cluster Unt_id)
                            
                            Linear regression                               Number of obs     =        520
                                                                            F(20, 127)        =          .
                                                                            Prob > F          =          .
                                                                            R-squared         =     0.6312
                                                                            Root MSE          =     12.022
                            
                                                                 (Std. err. adjusted for 128 clusters in Unt_id)
                            ------------------------------------------------------------------------------------
                                               |               Robust
                                      ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                            -------------------+----------------------------------------------------------------
                                             V |   .3158717   .6565524     0.48   0.631     -.983327     1.61507
                                            AR |   .1797848   .3943581     0.46   0.649    -.6005787    .9601483
                                          FQ_V |   .1561139   .0611755     2.55   0.012     .0350587    .2771691
                                         FQ_AR |   .2629835   .0790811     3.33   0.001     .1064962    .4194707
                                           ANQ |  -.0082478   .0725033    -0.11   0.910    -.1517187    .1352231
                               logTotalRevenue |    10.6022   2.496869     4.25   0.000     5.661348    15.54306
                                  CSRCommittee |   13.82067   2.177718     6.35   0.000     9.511362    18.12998
                            IndependentMembers |   .0993536   .0305465     3.25   0.001     .0389076    .1597996
                                               |
                                          year |
                                         2017  |   -.613203   1.104532    -0.56   0.580    -2.798872    1.572466
                                         2018  |  -1.425286   1.318795    -1.08   0.282    -4.034944    1.184371
                                         2019  |  -.4137909   1.535592    -0.27   0.788    -3.452451    2.624869
                                         2020  |   1.819051   1.682908     1.08   0.282     -1.51112    5.149222
                                               |
                                 IndustryDummy |
                                            2  |  -9.720466   4.605958    -2.11   0.037    -18.83483   -.6061058
                                            3  |  -.5525488   3.364249    -0.16   0.870     -7.20979    6.104692
                                            4  |   6.462793    4.18036     1.55   0.125    -1.809385    14.73497
                                            5  |   7.849962   3.596779     2.18   0.031     .7325862    14.96734
                                            6  |   2.067934   5.428091     0.38   0.704    -8.673278    12.80915
                                            7  |  -1.168484   4.438063    -0.26   0.793     -9.95061    7.613642
                                            8  |   .7608501   4.572634     0.17   0.868    -8.287567    9.809267
                                            9  |  -7.059257   4.734298    -1.49   0.138    -16.42758    2.309065
                                           10  |   6.482022   4.473768     1.45   0.150    -2.370756     15.3348



                            • #15
                              Marian:
                              there's nothing frustrating in having non-statistically significant coefficients: this is a popular misconception that, admittedly, dies hard.
                              Results are results and, provided that all the predictors (and interactions) needed to give a fair and true view of the data generating process you're investigating were included in the right-hand side of your regression equation, and the regression has been tested for the usual nuisances, there's nothing more you can do.
                              Non-significant coefficients are as informative as their significant counterparts.
                              This may also depend on the data at hand.
                              The best a researcher can do is try to understand the reasons underlying both non-significant and significant results.
                              Your -xtreg,fe- results are simply telling you that, when adjusted for the remaining predictors, time plays a relevant role in determining within-panel variation in the regressand. Hence, the question I'd pose myself is: why is it so? Are there any theoretical explanations? Did previous research report similar results?
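
                              A minimal sketch of how the joint relevance of the year effects could be checked, assuming the variable names used earlier in this thread and data that are already -xtset Unt_id year-. It only illustrates the point about time and is not a prescribed routine:

                              Code:
                              * fixed-effects model with year dummies in factor-variable notation
                              xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, fe vce(cluster Unt_id)
                              * joint test that all year dummies are zero
                              testparm i.year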

                              In addition, if -fe- is the way to go, the -re- estimator is inconsistent; therefore, the coefficients obtained are not better, but simply unreliable.
                              Lastly, I would avoid your last pooled OLS, as it sounds like an attempt to search for significance "whatever it takes".
                              Kind regards,
                              Carlo
                              (StataNow 18.5)
