  • Fixed effects vs Pooled OLS

    Hello,

    I am running my regression analysis. I used a fixed effects model with this command (xtreg return_outliers esg_score_w EPS_1 EPS_2 size_w, fe) and obtained the results I want (a positive and statistically significant relationship between the two main variables). However, with this command I am only controlling for id effects, because I used "xtset id year", right?
    I also want to control for year, country and industry effects. Should I use fixed effects and control for these variables, or should I use pooled OLS with dummies?
    If I should use fixed effects, what is the right command to control for these effects?

    I also used the testparm command to test whether the year, country and industry effects are jointly equal to zero, and I obtained the following results:

    reg return_outliers esg_score_w EPS_1 EPS_2 size_w i.country
    testparm i.country
    F( 18, 3211) = 1.47
    Prob > F = 0.0893

    reg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year
    testparm i.year
    F( 9, 3220) = 90.02
    Prob > F = 0.0000

    reg return_outliers esg_score_w EPS_1 EPS_2 size_w i.ec_sector
    testparm i.ec_sector
    F( 8, 3221) = 2.70
    Prob > F = 0.0058


    Thank you in advance,
    Rita
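
    The three tests above each come from a separate regression, so each set of dummies is tested without the other two in the model. A minimal sketch of testing them within one pooled OLS specification, assuming the panel identifier is id (as in "xtset id year") and clustering on it; this is an illustration, not a recommendation of pooled OLS over fixed effects:

    ```stata
    * Sketch only: one pooled OLS model with all three sets of dummies.
    * Clustering on id is an assumption (repeated observations per firm).
    reg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year i.country i.ec_sector, vce(cluster id)

    * Joint test of each set of dummies, conditional on the other two sets.
    testparm i.year
    testparm i.country
    testparm i.ec_sector
    ```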




  • #2
    Code:
    ssc install reghdfe, replace
    ssc install ftools, replace

    This looks like firm-level data. If it is a panel of firms, you should probably also control for firm fixed effects and cluster on the firm identifier.

    Code:
    reghdfe return_outliers esg_score_w EPS_1 EPS_2 size_w, absorb(firm country year sector) cluster(firm)
    Last edited by Andrew Musau; 24 Aug 2021, 09:49.
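
    A minimal sketch of a related point, assuming the variable names used later in this thread (id for the firm identifier, year for time) and assuming that no firm changes country or sector during the sample. Under that assumption the firm fixed effect already absorbs country and sector, so only the year effects need to be added on top of the firm effects:

    ```stata
    * Sketch only: assumes country and sector are constant within each firm,
    * so the firm fixed effect already accounts for them.
    xtset id year

    * Firm fixed effects via -fe-, year effects via i.year,
    * standard errors clustered on the firm identifier.
    xtreg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year, fe vce(cluster id)

    * Joint test of the year effects.
    testparm i.year

    * Same specification with -reghdfe- (community-contributed, from SSC).
    reghdfe return_outliers esg_score_w EPS_1 EPS_2 size_w, absorb(id year) vce(cluster id)
    ```

    If the assumption fails (for example, a firm is reclassified into another sector), country and sector are not fully absorbed and should stay in absorb().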


    • #3
      Thank you for your answer. I tried the command that you sent, but with it I get a negative relationship between the return variable and the ESG score variable, which is not what I expected. What could be the reason for this?


      • #4
        You have to check that the model is properly specified. If it is and you have good data, then that is how it is. We do get unexpected results every so often. Discuss it with your supervisor or peers; I am not familiar with the governance literature (assuming that ESG is Environmental, Social and Governance).


        • #5
          Rita:
          building on Andrew's excellent assist, you can test whether your model is misspecified using the following approach:
          Code:
          . use "https://www.stata-press.com/data/r16/nlswork.dta"
          (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
          
          . reghdfe ln_wage c.age##c.age, absorb( idcode year) cluster( idcode )
          (dropped 551 singleton observations)
          (converged in 9 iterations)
          
          HDFE Linear regression                            Number of obs   =     27,959
          Absorbing 2 HDFE groups                           F(   2,   4158) =      44.91
          Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                            R-squared       =     0.6593
                                                            Adj R-squared   =     0.5995
                                                            Within R-sq.    =     0.0115
          Number of clusters (idcode)  =      4,159         Root MSE        =     0.3013
          
                                       (Std. Err. adjusted for 4,159 clusters in idcode)
          ------------------------------------------------------------------------------
                       |               Robust
               ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |   .0728746    .013687     5.32   0.000     .0460407    .0997085
                       |
           c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
          ------------------------------------------------------------------------------
          
          Absorbed degrees of freedom:
          ---------------------------------------------------------------+
           Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
          -------------+-------------------------------------------------|
                idcode |            0            4159           4159 *   |
                  year |           14              15              1     |
          ---------------------------------------------------------------+
          * = fixed effect nested within cluster; treated as redundant for DoF computation
          
          . predict fitted, xb
          (24 missing values generated)
          
          . g sq_fitted=fitted^2
          (24 missing values generated)
          
          . reghdfe ln_wage c.age##c.age fitted sq_fitted , absorb( idcode year) cluster( idcode )
          (dropped 551 singleton observations)
          (converged in 9 iterations)
          note: fitted omitted because of collinearity
          
          HDFE Linear regression                            Number of obs   =     27,959
          Absorbing 2 HDFE groups                           F(   3,   4158) =      37.34
          Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                            R-squared       =     0.6598
                                                            Adj R-squared   =     0.6001
                                                            Within R-sq.    =     0.0129
          Number of clusters (idcode)  =      4,159         Root MSE        =     0.3011
          
                                       (Std. Err. adjusted for 4,159 clusters in idcode)
          ------------------------------------------------------------------------------
                       |               Robust
               ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |   .3610546   .0685459     5.27   0.000     .2266681    .4954412
                       |
           c.age#c.age |  -.0050048   .0009355    -5.35   0.000    -.0068389   -.0031708
                       |
                fitted |          0  (omitted)
             sq_fitted |  -1.730591   .4047613    -4.28   0.000     -2.52414   -.9370428
          ------------------------------------------------------------------------------
          
          Absorbed degrees of freedom:
          ---------------------------------------------------------------+
           Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
          -------------+-------------------------------------------------|
                idcode |            0            4159           4159 *   |
                  year |           14              15              1     |
          ---------------------------------------------------------------+
          * = fixed effect nested within cluster; treated as redundant for DoF computation
          
          . test sq_fitted
          
           ( 1)  sq_fitted = 0
          
                 F(  1,  4158) =   18.28
                      Prob > F =    0.0000
          
          .
          *As -test- rejects the null that the coefficient on the squared fitted values is zero, there is evidence that the model is misspecified*
          Kind regards,
          Carlo
          (StataNow 18.5)
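
          The steps in the log above can be condensed as follows. This is a minimal sketch using the same dataset and model; the only changes are that the output is suppressed with -quietly- and that the linear fitted term, which the log shows is omitted because of collinearity, is left out of the second regression:

          ```stata
          * Sketch only: same test as in the log above, without the verbose output.
          use "https://www.stata-press.com/data/r16/nlswork.dta", clear

          * Step 1: baseline model and its linear prediction.
          quietly reghdfe ln_wage c.age##c.age, absorb(idcode year) cluster(idcode)
          predict fitted, xb

          * Step 2: square the prediction.
          generate sq_fitted = fitted^2

          * Step 3: re-estimate with the squared term added and test it.
          quietly reghdfe ln_wage c.age##c.age sq_fitted, absorb(idcode year) cluster(idcode)
          test sq_fitted

          * -test- leaves the p-value in r(p).
          display "p-value for sq_fitted = " %6.4f r(p)
          ```

          A small p-value is evidence of misspecification; a large one only means that this particular check did not detect a problem.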


          • #6
            Note that the literature on "wrong" signs is small, but here are the citations I know of:

            Mullet, GM (1976), "Why regression coefficients have the wrong sign", Journal of Quality Technology, 8(3):121-126

            Kennedy, PE (2005), "Oh No! I got the wrong sign! what should I do?", The Journal of Economic Education, 36(1): 77-92 (I love the title!)

            Schuit, E, et al. (2013), "Unexpected predictor-outcome associations in clinical prediction research: causes and solutions", CMAJ, doi:10.1503/cmaj.120812

            Carlo Lazzaro gives a test that, while fine as far as it goes, can miss quite a lot, as there are other forms of misspecification, including the absence of variables that, if we knew more, would or might be important.


            • #7
              Many thanks to Rich Goldstein for providing such interesting references.
              In addition, I share his point about the test I suggested to check model misspecification. Obviously, it has diagnostic value only. Once we detect that the model is misspecified, we should identify the cause(s) of the misspecification, which may well be the toughest part of the regression postestimation check, as misspecification may be due to imperfect knowledge of the data generating process (i.e., missing predictors and/or interactions) or to even more demanding issues, such as endogeneity.
              Kind regards,
              Carlo
              (StataNow 18.5)


              • #8
                Thank you
                Last edited by Rita Santos; 25 Aug 2021, 04:41.


                • #9
                  Rita:
                  yes, your -reghdfe- model does not seem to be misspecified.
                  However, the within R-sq is really low (0.0620).
                  What if you go back to -xtreg, fe- and code:
                  Code:
                  xtreg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year, fe
                  Kind regards,
                  Carlo
                  (StataNow 18.5)


                  • #10
                    Good morning,

                    Thank you all for your answers and help.
                    I did what Carlo suggested, following his approach to test whether the model is misspecified.
                    I get the following results:

                    Code:
                    . reghdfe return_outliers esg_score_w EPS_1 EPS_2 size_w fitted sq_fitted, absorb(id year country ec_sector) cluster(id year)
                    variable fitted not found
                    r(111);
                    
                    . reghdfe return_outliers esg_score_w EPS_1 EPS_2 size_w, absorb(id year country ec_sector) cluster(id year)
                    (dropped 3 singleton observations)
                    (MWFE estimator converged in 5 iterations)
                    Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
                    
                    HDFE Linear regression                            Number of obs   =      3,317
                    Absorbing 4 HDFE groups                           F(   4,      8) =      13.19
                    Statistics robust to heteroskedasticity           Prob > F        =     0.0013
                                                                      R-squared       =     0.3438
                                                                      Adj R-squared   =     0.2434
                    Number of clusters (id)      =        401         Within R-sq.    =     0.0618
                    Number of clusters (year)    =          9         Root MSE        =     0.1861
                    
                                                    (Std. Err. adjusted for 9 clusters in id year)
                    ------------------------------------------------------------------------------
                                 |               Robust
                    return_out~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                     esg_score_w |  -.0006068   .0006725    -0.90   0.393    -.0021577    .0009441
                           EPS_1 |   1.227171   .2517041     4.88   0.001     .6467401    1.807602
                           EPS_2 |   .2977118   .1803194     1.65   0.137    -.1181055    .7135292
                          size_w |   .5491504   .3484121     1.58   0.154    -.2542893     1.35259
                           _cons |  -12.68179   8.091267    -1.57   0.156    -31.34029    5.976706
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                              id |       401         401           0    *|
                            year |         9           9           0    *|
                         country |        19           0          19     |
                       ec_sector |         9           1           8     |
                    -----------------------------------------------------+
                    * = FE nested within cluster; treated as redundant for DoF computation
                    
                    . predict fitted, xb
                    
                    . g sq_fitted=fitted^2
                    
                    . reghdfe return_outliers esg_score_w EPS_1 EPS_2 size_w fitted sq_fitted, absorb(id year country ec_sector) cluster(id)
                    (dropped 3 singleton observations)
                    (MWFE estimator converged in 5 iterations)
                    note: fitted omitted because of collinearity
                    
                    HDFE Linear regression                            Number of obs   =      3,317
                    Absorbing 4 HDFE groups                           F(   5,    400) =       9.75
                    Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                      R-squared       =     0.3440
                                                                      Adj R-squared   =     0.2437
                                                                      Within R-sq.    =     0.0620
                    Number of clusters (id)      =        401         Root MSE        =     0.1861
                    
                                                       (Std. Err. adjusted for 401 clusters in id)
                    ------------------------------------------------------------------------------
                                 |               Robust
                    return_out~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                     esg_score_w |  -.0005821   .0005811    -1.00   0.317    -.0017244    .0005602
                           EPS_1 |   1.247776   .2793813     4.47   0.000     .6985371    1.797016
                           EPS_2 |   .3021157   .1756475     1.72   0.086    -.0431919    .6474234
                          size_w |   .5424021   .2107533     2.57   0.010     .1280796    .9567246
                          fitted |          0  (omitted)
                       sq_fitted |  -.0637757   .1044684    -0.61   0.542    -.2691514    .1416001
                           _cons |  -12.50807   4.891628    -2.56   0.011    -22.12458   -2.891561
                    ------------------------------------------------------------------------------
                    
                    Absorbed degrees of freedom:
                    -----------------------------------------------------+
                     Absorbed FE | Categories  - Redundant  = Num. Coefs |
                    -------------+---------------------------------------|
                              id |       401         401           0    *|
                            year |         9           0           9     |
                         country |        19           1          18     |
                       ec_sector |         9           1           8    ?|
                    -----------------------------------------------------+
                    ? = number of redundant parameters may be higher
                    * = FE nested within cluster; treated as redundant for DoF computation
                    
                    . test sq_fitted
                    
                     ( 1)  sq_fitted = 0
                    
                           F(  1,   400) =    0.37
                                Prob > F =    0.5419

                    This means that we cannot reject the null hypothesis, so there is no evidence that the model is misspecified. Am I right?
                    What should I do next? And what could be the reason that the fixed effects model gives the expected sign and this model does not?

                    Thank you again,
                    Rita


                    • #11
                      Carlo,
                      Using that code, I got the following results:

                      Code:
                      xtreg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year, fe
                      
                      Fixed-effects (within) regression               Number of obs     =      3,320
                      Group variable: id                              Number of groups  =        403
                      
                      R-sq:                                           Obs per group:
                           within  = 0.2556                                         min =          1
                           between = 0.0011                                         avg =        8.2
                           overall = 0.0110                                         max =         10
                      
                                                                      F(13,2904)        =      76.70
                      corr(u_i, Xb)  = -0.9743                        Prob > F          =     0.0000
                      
                      ------------------------------------------------------------------------------
                      return_out~s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                       esg_score_w |  -.0006068   .0005371    -1.13   0.259    -.0016599    .0004464
                             EPS_1 |   1.227171   .1270039     9.66   0.000     .9781441    1.476198
                             EPS_2 |   .2977118    .101458     2.93   0.003     .0987749    .4966488
                            size_w |   .5491504   .1986232     2.76   0.006     .1596937     .938607
                                   |
                              year |
                             2011  |   .0914974   .1954676     0.47   0.640    -.2917718    .4747667
                             2012  |   .3622928   .1954448     1.85   0.064    -.0209317    .7455173
                             2013  |   .3965106   .1954389     2.03   0.043     .0132977    .7797235
                             2014  |   .2809763   .1954555     1.44   0.151    -.1022692    .6642217
                             2015  |   .3248593   .1954894     1.66   0.097    -.0584527    .7081713
                             2016  |   .2998047   .1954768     1.53   0.125    -.0834825    .6830919
                             2017  |   .3646048   .1954936     1.87   0.062    -.0187154     .747925
                             2018  |   .2846121   .1955118     1.46   0.146    -.0987439     .667968
                             2019  |   .4474344   .1955605     2.29   0.022     .0639831    .8308857
                                   |
                             _cons |   -13.0021   4.613409    -2.82   0.005    -22.04799   -3.956219
                      -------------+----------------------------------------------------------------
                           sigma_u |  .54833798
                           sigma_e |  .18518029
                               rho |  .89762629   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      F test that all u_i=0: F(402, 2904) = 1.26                   Prob > F = 0.0007
                      
                      .


                      • #12
                        Rita:
                        what does
                        Code:
                        testparm i.year
                        give you back?
                        Kind regards,
                        Carlo
                        (StataNow 18.5)


                        • #13
                          Carlo,

                          This is the result:

                          Code:
                          testparm i.year
                          
                           ( 1)  2011.year = 0
                           ( 2)  2012.year = 0
                           ( 3)  2013.year = 0
                           ( 4)  2014.year = 0
                           ( 5)  2015.year = 0
                           ( 6)  2016.year = 0
                           ( 7)  2017.year = 0
                           ( 8)  2018.year = 0
                           ( 9)  2019.year = 0
                          
                                 F(  9,  2904) =   82.24
                                      Prob > F =    0.0000


                          • #14
                            Rita:
                            provided that you've already checked that the default standard errors are OK for your regression (i.e., you did not spot heteroskedasticity and/or autocorrelation issues), I would stick with the code in #11.
                            Kind regards,
                            Carlo
                            (StataNow 18.5)
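
                            One way to see how much the choice of standard errors matters is to put the default and the cluster-robust versions of the model in #11 side by side. This is a minimal sketch, not a formal test for heteroskedasticity or autocorrelation; the stored-estimate names fe_default and fe_cluster are made up for illustration:

                            ```stata
                            * Sketch only: model from #11 with default standard errors.
                            quietly xtreg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year, fe
                            estimates store fe_default

                            * Same model, standard errors clustered on the panel identifier.
                            quietly xtreg return_outliers esg_score_w EPS_1 EPS_2 size_w i.year, fe vce(cluster id)
                            estimates store fe_cluster

                            * Coefficients are identical by construction; compare the standard errors.
                            estimates table fe_default fe_cluster, keep(esg_score_w EPS_1 EPS_2 size_w) b(%9.4f) se(%9.4f)
                            ```

                            If the clustered standard errors are clearly larger than the default ones, relying on the default errors is harder to justify.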


                            • #15
                              And why not this code?

                              Code:
                              reg return_outliers esg_score_w eps_1_w eps_2_w size_w i.year i.country i.ec_sector, robust
                              
                              Linear regression                               Number of obs     =      3,299
                                                                              F(38, 3259)       =          .
                                                                              Prob > F          =          .
                                                                              R-squared         =     0.2647
                                                                              Root MSE          =     .18391
                              
                              -----------------------------------------------------------------------------------
                                                |               Robust
                                return_outliers |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              ------------------+----------------------------------------------------------------
                                    esg_score_w |   -.001392   .0002237    -6.22   0.000    -.0018305   -.0009535
                                        eps_1_w |    .757809   .1387969     5.46   0.000     .4856711    1.029947
                                        eps_2_w |   1.343233   .1893909     7.09   0.000     .9718958     1.71457
                                         size_w |   .0147373   .0037439     3.94   0.000     .0073967    .0220779
                                                |
                                           year |
                                          2011  |   .0135683   .0208592     0.65   0.515    -.0273301    .0544668
                                          2012  |   .2917392   .0214013    13.63   0.000     .2497778    .3337006
                                          2013  |   .3300345   .0215379    15.32   0.000     .2878054    .3722636
                                          2014  |   .2114138   .0202714    10.43   0.000     .1716678    .2511598
                                          2015  |   .2489924   .0210906    11.81   0.000     .2076403    .2903446
                                          2016  |   .2252492   .0212724    10.59   0.000     .1835405    .2669579
                                          2017  |   .2937315   .0222372    13.21   0.000     .2501312    .3373318
                                          2018  |   .2132386   .0209932    10.16   0.000     .1720775    .2543998
                                          2019  |   .3773193     .02058    18.33   0.000     .3369682    .4176704
                                                |
                                        country |
                                           FRA  |   .0052694   .0125008     0.42   0.673    -.0192408    .0297795
                                            NL  |  -.0104605   .0173584    -0.60   0.547     -.044495     .023574
                                           GBR  |  -.0057102   .0121971    -0.47   0.640    -.0296251    .0182046
                                           GER  |  -.0253848   .0144718    -1.75   0.080    -.0537595    .0029899
                                           DNM  |   .0386704   .0223531     1.73   0.084    -.0051571     .082498
                                           BEL  |  -.0298513   .0195516    -1.53   0.127     -.068186    .0084835
                                           ESP  |   .0122164   .0195561     0.62   0.532    -.0261271    .0505599
                                           ITA  |   .0076297   .0196178     0.39   0.697    -.0308347    .0460941
                                           SWD  |   .0113217   .0152837     0.74   0.459     -.018645    .0412884
                                           FIN  |   .0089689   .0206183     0.43   0.664    -.0314572     .049395
                                           NOR  |   .0321693   .0264136     1.22   0.223    -.0196196    .0839582
                                           IRE  |    .011482   .0249939     0.46   0.646    -.0375234    .0604875
                                           POR  |   .0437562   .0367782     1.19   0.234    -.0283544    .1158669
                                          AUST  |  -.0479834   .0353208    -1.36   0.174    -.1172367    .0212698
                                           LUX  |  -.0131475   .0456616    -0.29   0.773    -.1026757    .0763808
                                           POL  |  -.0664197   .0450929    -1.47   0.141    -.1548329    .0219935
                                           CYP  |   .0078155   .1195987     0.07   0.948    -.2266808    .2423117
                                           JER  |    .027114   .0979013     0.28   0.782    -.1648404    .2190684
                                                |
                                      ec_sector |
                                 ConsCyclicals  |   .0076829    .012623     0.61   0.543    -.0170669    .0324326
                              ConsNonCyclicals  |   .0135818   .0123677     1.10   0.272    -.0106675     .037831
                                        Energy  |  -.0075456   .0175671    -0.43   0.668    -.0419893    .0268982
                                    Healthcare  |   .0541731   .0146289     3.70   0.000     .0254903     .082856
                                           Ind  |   .0160107   .0112013     1.43   0.153    -.0059517    .0379731
                                       RealEst  |   .0409051   .0135634     3.02   0.003     .0143114    .0674988
                                          Tech  |    .024592   .0136842     1.80   0.072    -.0022385    .0514226
                                          Util  |   .0262104   .0171516     1.53   0.127    -.0074187    .0598395
                                                |
                                          _cons |  -.4676585   .0906945    -5.16   0.000    -.6454825   -.2898346
                              -----------------------------------------------------------------------------------
                              
                              .
