Fixed-Effects Regression with or without Year-Dummies?

Marian Dudeck

Join Date: Feb 2022
Posts: 12

15 Feb 2022, 13:04

Hey Statas,
I have trouble understanding why you should use year dummies in a fixed-effects regression when I have already declared the time variable in xtset.
When I use xtset, I declare company (Unt_id) as the panel variable and year as the time variable. If the time variable is already declared there, why should I include year dummies in my regression again?

My results from the regression with year dummies differ from the results I get from the regression without year dummies.
Should I use the one with the higher R^2?

Thanks in advance!

Code:

xtreg ESGScore AR V FQ_V FQ_AR ANQ IndependentMembers VG CSRCommittee logTotalRevenue, fe vce(cluster Unt_id)

Fixed-effects (within) regression               Number of obs     =        518
Group variable: Unt_id                          Number of groups  =        128

R-squared:                                      Obs per group:
     Within  = 0.3501                                         min =          1
     Between = 0.5492                                         avg =        4.0
     Overall = 0.5204                                         max =          5

                                                F(9,127)          =      14.56
corr(u_i, Xb) = -0.4480                         Prob > F          =     0.0000

                                     (Std. err. adjusted for 128 clusters in Unt_id)
------------------------------------------------------------------------------------
                   |               Robust
          ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
                AR |   .4419116   .3132781     1.41   0.161    -.1780093    1.061832
                 V |   .5223736   .4697566     1.11   0.268      -.40719    1.451937
              FQ_V |   4.000499   3.999181     1.00   0.319    -3.913158    11.91416
             FQ_AR |    23.9788   5.367196     4.47   0.000     13.35809    34.59952
               ANQ |   7.998329   9.642444     0.83   0.408    -11.08233    27.07899
IndependentMembers |   .1809809   .0402595     4.50   0.000     .1013146    .2606473
                VG |   .2437537   .2735775     0.89   0.375    -.2976069    .7851142
      CSRCommittee |   8.454832   1.919122     4.41   0.000     4.657236    12.25243
   logTotalRevenue |   16.13881   4.720446     3.42   0.001     6.797899    25.47972
             _cons |  -129.1692   44.74514    -2.89   0.005    -217.7118   -40.62666
-------------------+----------------------------------------------------------------
           sigma_u |  14.130315
           sigma_e |   6.127202
               rho |  .84173156   (fraction of variance due to u_i)
------------------------------------------------------------------------------------

. 
. 
. xtreg ESGScore AR V FQ_V FQ_AR ANQ IndependentMembers VG CSRCommittee logTotalRevenue i.year, fe vce(cluster Unt_id)

Fixed-effects (within) regression               Number of obs     =        518
Group variable: Unt_id                          Number of groups  =        128

R-squared:                                      Obs per group:
     Within  = 0.5170                                         min =          1
     Between = 0.4853                                         avg =        4.0
     Overall = 0.4958                                         max =          5

                                                F(13,127)         =      28.34
corr(u_i, Xb) = 0.0412                          Prob > F          =     0.0000

                                     (Std. err. adjusted for 128 clusters in Unt_id)
------------------------------------------------------------------------------------
                   |               Robust
          ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
                AR |   .2745461   .2723805     1.01   0.315    -.2644458     .813538
                 V |   .4833979    .355234     1.36   0.176     -.219546    1.186342
              FQ_V |  -1.281977    3.38248    -0.38   0.705    -7.975294     5.41134
             FQ_AR |   .9844655   5.182062     0.19   0.850    -9.269899    11.23883
               ANQ |   13.28432   4.669543     2.84   0.005     4.044135     22.5245
IndependentMembers |   .1150257    .032074     3.59   0.000      .051557    .1784944
                VG |  -.6194623   .4177155    -1.48   0.141    -1.446046    .2071214
      CSRCommittee |   5.000126    1.67506     2.99   0.003     1.685485    8.314767
   logTotalRevenue |    10.1636   3.853459     2.64   0.009     2.538296     17.7889
                   |
              year |
             2017  |   2.130767    .819544     2.60   0.010     .5090374    3.752497
             2018  |     4.4858   1.104202     4.06   0.000     2.300782    6.670817
             2019  |   6.940105   1.266919     5.48   0.000       4.4331     9.44711
             2020  |   10.78061   1.515915     7.11   0.000      7.78089    13.78033
                   |
             _cons |  -62.66249   36.08125    -1.74   0.085    -134.0608    8.735795
-------------------+----------------------------------------------------------------
           sigma_u |  13.417675
           sigma_e |  5.3102312
               rho |  .86458142   (fraction of variance due to u_i)
------------------------------------------------------------------------------------

Clyde Schechter

Join Date: Apr 2014

Posts: 29959
#2

15 Feb 2022, 14:36

If you add year indicators ("dummies") to your regression after -xtset panelvar timevar- you are not adding them "again." While -xtset- automatically causes subsequent -xt- estimation commands to include panel indicators (or condition on panel, for non-linear models), it does not cause them to include time variables. So you can include or exclude time variables as you see fit: but if you want them included, you have to do it explicitly in your estimation command, -xtset- does not make Stata do it for you.

"Should I use the one with the higher R^2?"

No. First of all, with -xtreg- you get three different R² statistics, and they disagree about which model gets the higher one. But even if this were just ordinary linear regression without panel structure, the model with more variables will never get a lower R² (unless the additional variables have missing values and cause the sample size to shrink), so a higher R² by itself tells you nothing. Rather, you should base this choice on whether or not it is reasonable to expect yearly shocks in the outcome variable. If so, you should include year indicators to remove that extraneous source of variance from the outcome. But if there is nothing about the outcome that changes from year to year, then there is no need to include them. It's a substantive question, really. If you are unsure about it in your case, consult somebody in your field.
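As a minimal sketch of the point about -xtset-, assuming Stata's nlswork example data (the same file used later in this thread, not the ESG data) and regressors chosen purely for illustration: the first model has panel fixed effects only, the second adds year indicators explicitly, and -testparm- then tests the year indicators jointly.

Code:

```stata
* Illustration only: nlswork example data, not the ESG data from this thread
use "https://www.stata-press.com/data/r17/nlswork.dta", clear
xtset idcode year

* Panel fixed effects only: -xtset- does not add any year indicators
xtreg ln_wage tenure hours, fe vce(cluster idcode)

* Panel fixed effects plus year indicators, requested explicitly via i.year
xtreg ln_wage tenure hours i.year, fe vce(cluster idcode)

* Joint test that all year indicators are zero
testparm i.year
```

A small p-value from -testparm- suggests yearly shocks in these data, but whether to keep the indicators remains a substantive decision, as said above.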

Last edited by Clyde Schechter; 15 Feb 2022, 14:42.

Marian Dudeck

Join Date: Feb 2022
Posts: 12

16 Feb 2022, 00:23

Thanks Clyde, I have a further question. What do you mean by "if there is nothing about the outcome that changes from year to year"? The outcome variable, the ESG score, seems to rise from year to year, and I want to know if that has something to do with the overall board composition of German companies (AR = size of supervisory board, V = size of management board, FQ_AR = percentage of women on the supervisory board, etc.). The literature seems to be clear that women have a significant positive impact on the ESG score, which makes me wonder if my regression is "wrong", because there they don't have any significant impact when I regress with fixed effects and year dummies.

If I run the regression separately for each year, I get the result in every year that women have a positive impact (see code).
I want to avoid a result in which none of the CG variables I have chosen has a significant influence, since that is not really what the current literature says.

Thanks in advance

Code:

. keep if year == 2020
(394 observations deleted)

. reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)

Linear regression                               Number of obs     =        131
                                                F(7, 130)         =      35.44
                                                Prob > F          =     0.0000
                                                R-squared         =     0.5622
                                                Root MSE          =     12.765

                                  (Std. err. adjusted for 131 clusters in Unt_id)
---------------------------------------------------------------------------------
                |               Robust
       ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
              V |   .6480332   .7697278     0.84   0.401    -.8747811    2.170848
             AR |  -.3449393   .4207435    -0.82   0.414     -1.17733    .4874515
           FQ_V |   10.34147   7.098629     1.46   0.148    -3.702317    24.38526
          FQ_AR |   16.65881    9.45496     1.76   0.080      -2.0467    35.36431
            ANQ |   8.983921   8.290711     1.08   0.281    -7.418259     25.3861
logTotalRevenue |    11.1041   2.493682     4.45   0.000     6.170646    16.03755
   CSRCommittee |   17.76665   3.203929     5.55   0.000     11.42806    24.10524
          _cons |  -65.81481   19.51878    -3.37   0.001    -104.4304   -27.19923
---------------------------------------------------------------------------------

. 
. clear

. import data
. keep if year == 2019
(402 observations deleted)

. reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)

Linear regression                               Number of obs     =        123
                                                F(7, 122)         =      28.09
                                                Prob > F          =     0.0000
                                                R-squared         =     0.5540
                                                Root MSE          =     13.426

                                  (Std. err. adjusted for 123 clusters in Unt_id)
---------------------------------------------------------------------------------
                |               Robust
       ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
              V |   1.090006   .7970792     1.37   0.174    -.4878914    2.667904
             AR |   .2255095    .438015     0.51   0.608     -.641585    1.092604
           FQ_V |    15.3489    7.48377     2.05   0.042     .5340269    30.16377
          FQ_AR |   42.77137   11.47711     3.73   0.000     20.05128    65.49146
            ANQ |  -9.875031   9.368227    -1.05   0.294    -28.42037    8.670311
logTotalRevenue |    9.57542   2.536419     3.78   0.000     4.554325    14.59652
   CSRCommittee |   12.81309   3.394163     3.78   0.000     6.094005    19.53218
          _cons |  -59.88676   19.88828    -3.01   0.003     -99.2576   -20.51592
---------------------------------------------------------------------------------

. 
. clear

. import data

. keep if year == 2018
(409 observations deleted)

. reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)

Linear regression                               Number of obs     =        116
                                                F(7, 115)         =      31.73
                                                Prob > F          =     0.0000
                                                R-squared         =     0.6017
                                                Root MSE          =     13.186

                                  (Std. err. adjusted for 116 clusters in Unt_id)
---------------------------------------------------------------------------------
                |               Robust
       ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
              V |   1.358497   .8869254     1.53   0.128    -.3983315    3.115326
             AR |   .0414441   .4269679     0.10   0.923    -.8042971    .8871852
           FQ_V |   18.76238   8.804256     2.13   0.035     1.322839    36.20191
          FQ_AR |   56.32548   11.41053     4.94   0.000     33.72343    78.92754
            ANQ |  -12.62858   8.778121    -1.44   0.153    -30.01635    4.759193
logTotalRevenue |   9.077139   2.650677     3.42   0.001     3.826659    14.32762
   CSRCommittee |   15.99752   3.187191     5.02   0.000      9.68431    22.31073
          _cons |  -60.94069   20.71223    -2.94   0.004    -101.9676   -19.91374
---------------------------------------------------------------------------------

. 
. clear

. import data

. keep if year == 2017
(439 observations deleted)

. reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)

Linear regression                               Number of obs     =         86
                                                F(7, 85)          =      10.49
                                                Prob > F          =     0.0000
                                                R-squared         =     0.4777
                                                Root MSE          =     14.515

                                   (Std. err. adjusted for 86 clusters in Unt_id)
---------------------------------------------------------------------------------
                |               Robust
       ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
              V |   .0185184   1.028664     0.02   0.986    -2.026742    2.063779
             AR |  -.0647979    .479464    -0.14   0.893    -1.018101     .888505
           FQ_V |   37.95327   15.53597     2.44   0.017     7.063605    68.84293
          FQ_AR |    31.9923   17.65253     1.81   0.073    -3.105653    67.09025
            ANQ |   4.773122    10.8119     0.44   0.660    -16.72384    26.27008
logTotalRevenue |   9.343637   3.617755     2.58   0.012     2.150571     16.5367
   CSRCommittee |   10.65174   4.449292     2.39   0.019     1.805355    19.49812
          _cons |   -51.0281   28.73985    -1.78   0.079    -108.1706     6.11443
---------------------------------------------------------------------------------

. 
. clear

. import data

. keep if year == 2016
(456 observations deleted)

. reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee, vce(cluster Unt_id)

Linear regression                               Number of obs     =         69
                                                F(7, 68)          =      14.21
                                                Prob > F          =     0.0000
                                                R-squared         =     0.5723
                                                Root MSE          =      13.93

                                   (Std. err. adjusted for 69 clusters in Unt_id)
---------------------------------------------------------------------------------
                |               Robust
       ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
              V |   2.123608   1.126406     1.89   0.064    -.1241004    4.371317
             AR |  -.3180608    .585401    -0.54   0.589     -1.48621    .8500887
           FQ_V |   50.44724   13.03643     3.87   0.000     24.43344    76.46104
          FQ_AR |   17.95735    17.4664     1.03   0.308     -16.8963    52.81101
            ANQ |  -5.034274   13.30385    -0.38   0.706     -31.5817    21.51315
logTotalRevenue |   6.882246   4.296216     1.60   0.114     -1.69072    15.45521
   CSRCommittee |   18.86971   5.100967     3.70   0.000     8.690888    29.04853
          _cons |  -30.97272     34.732    -0.89   0.376    -100.2794    38.33391
---------------------------------------------------------------------------------


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#4

16 Feb 2022, 01:19

Marian:
the within R_sq in your second -xtreg,fe- code is remarkably higher than its first code counterpart.
This difference makes me think that including -i.year- is the way to go.
In addition, your -i.year- coefficients are clearly statistically significant, and if you test their joint statistical significance via -testparm- this result will be confirmed.
In addition, the panel-wise effect and the vector of regressors seem to be poorly correlated (corr(u_i, Xb) = 0.0412). This might be a warning signal to explore -xtreg,re-.
I would also test whether the functional form of the regressand is correctly specified.
Finally, I would not challenge myself with annual regressions, as they do not take the panel structure of your data into account.

Kind regards,
Carlo
(Stata 19.0)

Marian Dudeck

Join Date: Feb 2022
Posts: 12

16 Feb 2022, 01:46

Hello Carlo, thanks for the reply.
I tested whether I should use the random-effects or the fixed-effects model via a Hausman test and got the result to go with fe.
When I run the regression with re, corr(u_i, X) is reported as 0 (assumed):

Code:

xtreg ESGScore_neu V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)

Random-effects GLS regression                   Number of obs     =        520
Group variable: Unt_id                          Number of groups  =        205

R-squared:                                      Obs per group:
     Within  = 0.4858                                         min =          1
     Between = 0.5504                                         avg =        2.5
     Overall = 0.5518                                         max =          5

                                                Wald chi2(12)     =     607.60
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                     (Std. err. adjusted for 205 clusters in Unt_id)
------------------------------------------------------------------------------------
                   |               Robust
      ESGScore_neu | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------------+----------------------------------------------------------------
                 V |   .6841231   .3508614     1.95   0.051    -.0035526    1.371799
                AR |   .0696171   .2942224     0.24   0.813    -.5070482    .6462825
              FQ_V |   .9653356   4.660118     0.21   0.836    -8.168328      10.099
             FQ_AR |   25.80802   8.125275     3.18   0.001     9.882771    41.73326
               ANQ |   7.529397    5.35747     1.41   0.160    -2.971052    18.02985
   logTotalRevenue |   12.37862   2.070293     5.98   0.000     8.320923    16.43632
      CSRCommittee |    7.84636   1.949793     4.02   0.000     4.024836    11.66788
IndependentMembers |   .1495845    .022799     6.56   0.000     .1048993    .1942698
                   |
              year |
             2017  |   1.929183   1.075316     1.79   0.073    -.1783972    4.036763
             2018  |   3.837856   1.401931     2.74   0.006     1.090121     6.58559
             2019  |   6.423435   1.582039     4.06   0.000     3.322696    9.524175
             2020  |   9.929435   1.753593     5.66   0.000     6.492455    13.36641
                   |
             _cons |  -86.05129   16.61822    -5.18   0.000    -118.6224   -53.48017
-------------------+----------------------------------------------------------------
           sigma_u |  12.484874
           sigma_e |  5.8458886
               rho |  .82017866   (fraction of variance due to u_i)
------------------------------------------------------------------------------------

How can I test whether the functional form of the regressand is correctly specified?
Thanks so far!

Last edited by Marian Dudeck; 16 Feb 2022, 01:48.


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17676

16 Feb 2022, 02:02

Marian:
1) if you imposed non-default standard errors, you cannot use -hausman- to compare the -fe- vs. -re- specifications; you should switch to the community-contributed module -xtoverid- instead. In addition, testing via -hausman- with default standard errors and then imposing non-default ones after seeing the -hausman- outcome is not correct;
2) as we know from any decent textbook on panel data econometrics, -re- imposes a further constraint vs. -fe-: the correlation between the panel-wise effect and the vector of regressors is assumed to be 0 (unfortunately, this assumption may be far from the truth). The -xtoverid- helpfile covers this issue too;
3) the test for misspecification of the functional form of the regressand can be considered, more broadly, a test for model misspecification. It is based on the very same procedure detailed under the -linktest- entry, Stata .pdf manual; unfortunately, -linktest- itself cannot be invoked after -xtreg-.
In the following toy-example the model is (deliberately) misspecified, as you can see from the statistical significance of the -sq_fitted- term:

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage c.age##c.age, re vce(cluster idcode)

Random-effects GLS regression                   Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1087                                         min =          1
     Between = 0.1015                                         avg =        6.1
     Overall = 0.0870                                         max =         15

                                                Wald chi2(2)      =    1258.33
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0590339   .0041049    14.38   0.000     .0509884    .0670795
             |
 c.age#c.age |  -.0006758   .0000688    -9.83   0.000    -.0008107    -.000541
             |
       _cons |   .5479714   .0587198     9.33   0.000     .4328826    .6630601
-------------+----------------------------------------------------------------
     sigma_u |   .3654049
     sigma_e |  .30245467
         rho |  .59342665   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        ln_wage[idcode,t] = Xb + u[idcode] + e[idcode,t]

        Estimated results:
                         |       Var     SD = sqrt(Var)
                ---------+-----------------------------
                 ln_wage |   .2285836       .4781042
                       e |   .0914788       .3024547
                       u |   .1335207       .3654049

        Test: Var(u) = 0
                             chibar2(01) = 28074.51
                          Prob > chibar2 =   0.0000

. predict fitted, xb
(24 missing values generated)

. g sq_fitted=fitted^2
(24 missing values generated)

. xtreg ln_wage fitted sq_fitted , re vce(cluster idcode)

Random-effects GLS regression                   Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1088                                         min =          1
     Between = 0.1045                                         avg =        6.1
     Overall = 0.0887                                         max =         15

                                                Wald chi2(2)      =    1316.74
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   2.805959   .6246598     4.49   0.000     1.581648    4.030269
   sq_fitted |  -.5516341   .1920793    -2.87   0.004    -.9281026   -.1751656
       _cons |  -1.468083   .5055433    -2.90   0.004     -2.45893   -.4772365
-------------+----------------------------------------------------------------
     sigma_u |  .36481589
     sigma_e |  .30242516
         rho |  .59269507   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test sq_fitted

 ( 1)  sq_fitted = 0

           chi2(  1) =    8.25
         Prob > chi2 =    0.0041

.
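A minimal sketch of the same check after -xtreg, fe-, assuming the procedure carries over unchanged from the -re- toy example above. The variable names fitted_fe and sq_fitted_fe are made up for illustration, and the model is again deliberately simple.

Code:

```stata
* Illustration only: same toy model as above, estimated with fixed effects
use "https://www.stata-press.com/data/r17/nlswork.dta", clear
xtreg ln_wage c.age##c.age, fe vce(cluster idcode)

* Linear prediction and its square (hypothetical variable names)
predict fitted_fe, xb
generate sq_fitted_fe = fitted_fe^2

* Re-estimate on the fitted values; a significant squared term hints at misspecification
xtreg ln_wage fitted_fe sq_fitted_fe, fe vce(cluster idcode)
test sq_fitted_fe
```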

Kind regards,
Carlo
(Stata 19.0)


Marian Dudeck

Join Date: Feb 2022
Posts: 12

16 Feb 2022, 02:32

Hey Carlo, thanks for the reply.
I've taken your advice and run the -xtoverid- test to see if I should use re or fe with non-default standard errors.
The result is still to go with fe, am I correct?

I understand that I can test for model misspecification, but since I'm using the same model as in the literature, just with a different database, I'm wondering if this would be the way to go.

Code:

 xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers g1 g2 g3 g4, re vce(cluster Unt_id)

Random-effects GLS regression                   Number of obs     =        520
Group variable: Unt_id                          Number of groups  =        205

R-squared:                                      Obs per group:
     Within  = 0.4838                                         min =          1
     Between = 0.5376                                         avg =        2.5
     Overall = 0.5415                                         max =          5

                                                Wald chi2(12)     =     569.42
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                     (Std. err. adjusted for 205 clusters in Unt_id)
------------------------------------------------------------------------------------
                   |               Robust
          ESGScore | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------------+----------------------------------------------------------------
                 V |   .6271069   .3193894     1.96   0.050     .0011151    1.253099
                AR |   .0321355   .2686603     0.12   0.905    -.4944291       .5587
              FQ_V |   .0107116   .0424773     0.25   0.801    -.0725423    .0939655
             FQ_AR |   .1903748   .0736692     2.58   0.010     .0459858    .3347637
               ANQ |   .0645767   .0487449     1.32   0.185    -.0309616    .1601151
   logTotalRevenue |   11.31811   1.886636     6.00   0.000     7.620371    15.01585
      CSRCommittee |    7.17636   1.776066     4.04   0.000     3.695335    10.65739
IndependentMembers |   .1371363   .0203896     6.73   0.000     .0971735    .1770992
                g1 |  -9.129215   1.598175    -5.71   0.000    -12.26158   -5.996849
                g2 |  -7.395159   1.227468    -6.02   0.000    -9.800952   -4.989366
                g3 |  -5.646207   .9294309    -6.07   0.000    -7.467858   -3.824556
                g4 |  -3.242748   .6728079    -4.82   0.000    -4.561427   -1.924069
             _cons |  -67.84168   15.00505    -4.52   0.000    -97.25105   -38.43231
-------------------+----------------------------------------------------------------
           sigma_u |  11.331823
           sigma_e |  5.3911309
               rho |  .81543493   (fraction of variance due to u_i)
------------------------------------------------------------------------------------

. 
. 
. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  robust cluster(Unt_id)
Sargan-Hansen statistic  35.968  Chi-sq(12)   P-value = 0.0003


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#8

16 Feb 2022, 02:41

Marian:
yes, as -xtoverid- rejects the null, you should go -fe-.
However, does the -xtoverid- recommendation remain the same if you add -i.year- to the set of predictors?
As far as the misspecification test is concerned, unless the original model was tested for that, why not test it (at least to finish off the postestimation routine)?

Kind regards,
Carlo
(Stata 19.0)

Marian Dudeck

Join Date: Feb 2022
Posts: 12

16 Feb 2022, 02:52

When I add i.year and then run -xtoverid-, I get an error message, which is why I used g1,...,g4 as year dummies and left out the year dummy that was omitted:

Code:

xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)

Random-effects GLS regression                   Number of obs     =        520
Group variable: Unt_id                          Number of groups  =        205

R-squared:                                      Obs per group:
     Within  = 0.4838                                         min =          1
     Between = 0.5376                                         avg =        2.5
     Overall = 0.5415                                         max =          5

                                                Wald chi2(12)     =     569.42
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                     (Std. err. adjusted for 205 clusters in Unt_id)
------------------------------------------------------------------------------------
                   |               Robust
          ESGScore | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------------+----------------------------------------------------------------
                 V |   .6271069   .3193894     1.96   0.050     .0011151    1.253099
                AR |   .0321355   .2686603     0.12   0.905    -.4944291       .5587
              FQ_V |   .0107116   .0424773     0.25   0.801    -.0725423    .0939655
             FQ_AR |   .1371363   .0203896     6.73   0.000     .0971735    .1770992
               ANQ |   .0645767   .0487449     1.32   0.185    -.0309616    .1601151
   logTotalRevenue |   11.31811   1.886636     6.00   0.000     7.620371    15.01585
      CSRCommittee |    7.17636   1.776066     4.04   0.000     3.695335    10.65739
IndependentMembers |   .1903748   .0736692     2.58   0.010     .0459858    .3347637
                   |
              year |
             2017  |   1.734056   .9843553     1.76   0.078    -.1952448    3.663357
             2018  |   3.483008   1.277171     2.73   0.006     .9797983    5.986217
             2019  |   5.886467   1.444741     4.07   0.000     3.054827    8.718106
             2020  |   9.129215   1.598175     5.71   0.000     5.996849    12.26158
                   |
             _cons |  -76.97089   15.17976    -5.07   0.000    -106.7227   -47.21912
-------------------+----------------------------------------------------------------
           sigma_u |  11.331823
           sigma_e |  5.3911309
               rho |  .81543493   (fraction of variance due to u_i)
------------------------------------------------------------------------------------

.
.
. xtoverid
2016b:  operator invalid
r(198);

Comment

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#10

16 Feb 2022, 02:57

Marian:
being a bit old-fashioned, the glorious -xtoverid- does not support -fvvarlist- notation (my bad: I did not warn you about it in my previous post).
Try the usual fix:

Code:

xi: xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)
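
A sketch of the full sequence, assuming the dataset from this thread is in memory. -xtoverid- is community-contributed; that it is installed from SSC together with -ivreg2- and -ranktest- is my assumption about the setup, so skip the first three lines if they are already installed:

```stata
* one-time installation of the community-contributed commands (assumed to come from SSC)
ssc install xtoverid
ssc install ivreg2
ssc install ranktest

* the xi: prefix expands i.year into plain _Iyear_* dummies that -xtoverid- can parse
xi: xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, re vce(cluster Unt_id)

* cluster-robust test of the extra orthogonality conditions imposed by -re-;
* a rejection points towards -fe-
xtoverid
```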

Kind regards,
Carlo
(Stata 19.0)
Comment
Marian Dudeck

Join Date: Feb 2022

Posts: 12
#11

16 Feb 2022, 03:04

Thanks Carlo,
Unfortunately, the p-value still remains 0.0003, so I have to use -fe-.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#12

16 Feb 2022, 06:01

Marian:
why unfortunately?
Usually it's -re- that is considered the last resort.

Kind regards,
Carlo
(Stata 19.0)
Comment

Marian Dudeck

Join Date: Feb 2022
Posts: 12

#13

16 Feb 2022, 09:38

Because with -re- I get better results. It's just frustrating to have no significant variables in a regression that I include in my master's thesis.
I researched fixed effects a bit and came across a book which argues that fixed effects aren't appropriate if within variation is low, and that one should rather go with pooled OLS and clustered standard errors.
When I look at my data, I can see that for every (non-control) variable the within variance is much lower than the between variance.
Another alternative would be to lag the variables by one period, since I want to know whether women on the management board have an impact on ESG performance. They can't make a difference if they joined in the same year in which the ESG score was measured.

Code:

xtsum V AR FQ_AR FQ_V ANQ

Variable         |      Mean   Std. dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
V        overall |  4.367619   1.753976          2          9 |     N =     525
         between |             1.577374          2        8.4 |     n =     132
         within  |              .693527   -.032381   7.617619 | T-bar = 3.97727
                 |                                            |
AR       overall |     11.56   5.484426          3         21 |     N =     525
         between |             5.444044          3         21 |     n =     132
         within  |             .5963204       7.16      15.56 | T-bar = 3.97727
                 |                                            |
FQ_AR    overall |  28.92907   12.39945          0   66.66667 |     N =     525
         between |             12.33955          0   61.11111 |     n =     132
         within  |             5.216747    3.92907   55.59574 | T-bar = 3.97727
                 |                                            |
FQ_V     overall |  9.811565   14.11851          0   66.66667 |     N =     525
         between |             11.90917          0         50 |     n =     132
         within  |             9.142117  -27.68844    63.1449 | T-bar = 3.97727
                 |                                            |
ANQ      overall |  32.41492   23.03148          0   71.42857 |     N =     525
         between |             23.76007          0   54.28571 |     n =     132
         within  |             2.489571   15.74826   69.91492 | T-bar = 3.97727
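
The lag idea from this post can be sketched with Stata's time-series operators. This is only a sketch under assumptions: the data are -xtset- on Unt_id and year with yearly spacing, and lagging only FQ_V and FQ_AR is an illustrative guess at which variables capture women on the boards, not a recommendation.

```stata
* declare the panel so that L. knows what "one period back" means
xtset Unt_id year

* L.x is the previous year's value of x within the same company;
* observations without a value in the preceding year drop out of the estimation sample
xtreg ESGScore L.FQ_V L.FQ_AR V AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, fe vce(cluster Unt_id)
```

With a T-bar of about 4 years per company, losing one year per company to the lag is a noticeable cut in sample size.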

Comment

Marian Dudeck

Join Date: Feb 2022
Posts: 12

#14

16 Feb 2022, 10:18

Another alternative could be to include fixed effects for the industry instead of for the companies. The results are as follows:

Code:

reg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year i.IndustryDummy, vce(cluster Unt_id)

Linear regression                               Number of obs     =        520
                                                F(20, 127)        =          .
                                                Prob > F          =          .
                                                R-squared         =     0.6312
                                                Root MSE          =     12.022

                                     (Std. err. adjusted for 128 clusters in Unt_id)
------------------------------------------------------------------------------------
                   |               Robust
          ESGScore | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
                 V |   .3158717   .6565524     0.48   0.631     -.983327     1.61507
                AR |   .1797848   .3943581     0.46   0.649    -.6005787    .9601483
              FQ_V |   .1561139   .0611755     2.55   0.012     .0350587    .2771691
             FQ_AR |   .2629835   .0790811     3.33   0.001     .1064962    .4194707
               ANQ |  -.0082478   .0725033    -0.11   0.910    -.1517187    .1352231
   logTotalRevenue |    10.6022   2.496869     4.25   0.000     5.661348    15.54306
      CSRCommittee |   13.82067   2.177718     6.35   0.000     9.511362    18.12998
IndependentMembers |   .0993536   .0305465     3.25   0.001     .0389076    .1597996
                   |
              year |
             2017  |   -.613203   1.104532    -0.56   0.580    -2.798872    1.572466
             2018  |  -1.425286   1.318795    -1.08   0.282    -4.034944    1.184371
             2019  |  -.4137909   1.535592    -0.27   0.788    -3.452451    2.624869
             2020  |   1.819051   1.682908     1.08   0.282     -1.51112    5.149222
                   |
     IndustryDummy |
                2  |  -9.720466   4.605958    -2.11   0.037    -18.83483   -.6061058
                3  |  -.5525488   3.364249    -0.16   0.870     -7.20979    6.104692
                4  |   6.462793    4.18036     1.55   0.125    -1.809385    14.73497
                5  |   7.849962   3.596779     2.18   0.031     .7325862    14.96734
                6  |   2.067934   5.428091     0.38   0.704    -8.673278    12.80915
                7  |  -1.168484   4.438063    -0.26   0.793     -9.95061    7.613642
                8  |   .7608501   4.572634     0.17   0.868    -8.287567    9.809267
                9  |  -7.059257   4.734298    -1.49   0.138    -16.42758    2.309065
               10  |   6.482022   4.473768     1.45   0.150    -2.370756     15.3348

Comment

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#15

16 Feb 2022, 11:44

Marian:
there's nothing frustrating in having non-statistically significant coefficients: this is a popular misconception that, admittedly, dies hard.
Results are results and, provided that all predictors (and interactions) needed to give a fair and true view of the data generating process you're investigating were included in the right-hand side of your regression equation, and the regression has been tested for the usual nuisances, there's nothing more you can do.
Non-significant coefficients are as informative as their significant counterparts.
This may also depend on the data at hand.
The best a researcher can do is to try to understand the reasons underlying both non-significant and significant results.
Your -xtreg,fe- results are simply telling you that, when adjusted for the remaining predictors, time plays a relevant role in determining within-panel variation in the regressand. Hence, the question I'd pose myself is: why is it so? Are there any theoretical explanations? Did previous research report similar results?
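
One way to check the role of time formally is a joint test of the year dummies after -xtreg, fe-. A minimal sketch, assuming the dataset from this thread is in memory and already -xtset-:

```stata
* fixed-effects model with year dummies and cluster-robust standard errors
xtreg ESGScore V AR FQ_V FQ_AR ANQ logTotalRevenue CSRCommittee IndependentMembers i.year, fe vce(cluster Unt_id)

* joint Wald test that all year coefficients are zero;
* a small p-value supports keeping the year dummies in the model
testparm i.year
```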

In addition, if -fe- is the way to go, the -re- estimator is inconsistent; therefore, the coefficients it produces are not better, but simply unreliable.
Lastly, I would avoid your last pooled OLS, as it sounds like an attempt to search for significance "whatever it takes".

Kind regards,
Carlo
(Stata 19.0)
Comment
