Hello, I am working on my undergraduate dissertation, which looks at the effects of ESG on financial performance for US energy firms, using pooled OLS, fixed effects and Arellano-Bond estimators.
Currently, I am struggling to decide how many years to cover. After discussing with my supervisor, who warned me that unbalanced panel data could bias dynamic panel estimators such as Arellano-Bond, I had originally decided on a period of 5 years. However, I have since read Baltagi, B.H. and Chang, Y.-J. (1994) ‘Incomplete panels’, Journal of Econometrics, 62(2), pp. 67–89, doi:10.1016/0304-4076(94)90017-5, which finds that making the data balanced by dropping observations worsens the performance of estimators compared with using the entire unbalanced data set, so I am now unsure which approach to take.
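To make the question concrete, here is a minimal sketch of what the dynamic specification might look like on the full unbalanced panel. It assumes the panel is identified by ID and year, a single lag of TOBIN, and the control set from the fixed-effects model without RD_TR; none of these choices is final.

```stata
* Sketch only: assumes firms are identified by ID and time by year
xtset ID year

* Show the participation pattern, i.e. how unbalanced the panel is
xtdescribe

* Arellano-Bond difference GMM with one lag of the dependent variable
* (the lag length is an assumption), estimated on the unbalanced panel
xtabond TOBIN ENV SOC GOV DE BETA SIZE age, lags(1) vce(robust)

* Test for serial correlation in the first-differenced errors
estat abond
```

Running the same commands after `keep if year >= 2018` would show how much the sample size and the estimates change with the shorter window.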
Also, I am planning to run all my regressions with 2 specifications, one including the control variable "R&D intensity" and one without, because it has many missing observations (RD_TR in the summary statistics table). It is an important variable in past literature, and I will mention its limited coverage as a limitation of my study. Is this fine?
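A minimal sketch of how the two specifications could be run side by side for the fixed-effects model. The stored-estimate names (no_rd, with_rd, no_rd_same) are made up for illustration, and the third regression is an assumed extra check rather than part of the plan above: it re-runs the specification without RD_TR on the firm-years that report RD_TR, to separate the effect of adding the control from the effect of the smaller sample.

```stata
* Sketch only: assumes firms are identified by ID and time by year
xtset ID year

* Specification 1: without R&D intensity (larger sample)
xtreg TOBIN ENV SOC GOV DE BETA SIZE age, fe vce(cluster ID)
estimates store no_rd

* Specification 2: with R&D intensity (only firm-years reporting RD_TR)
xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe vce(cluster ID)
estimates store with_rd

* Specification 1 again, restricted to the estimation sample of specification 2
xtreg TOBIN ENV SOC GOV DE BETA SIZE age if e(sample), fe vce(cluster ID)
estimates store no_rd_same

* Coefficients, standard errors, observations and number of firms in one table
estimates table no_rd with_rd no_rd_same, b(%9.4f) se stats(N N_g)
```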
Furthermore, when testing my models for heteroskedasticity, I use xttest3 in Stata 17.0. However, the test result does not change after I re-estimate with vce(robust), and I am unsure why. Here I am using data from 2018-2023 and have dropped all firms which did not report their CO2 emissions.
Code:
tabulate year has_esg

           |        has_esg
      year |         0          1 |     Total
-----------+----------------------+----------
      2004 |        87         18 |       105
      2005 |        90         27 |       117
      2006 |        96         32 |       128
      2007 |       100         33 |       133
      2008 |        99         39 |       138
      2009 |        97         44 |       141
      2010 |        89         61 |       150
      2011 |        98         67 |       165
      2012 |        98         74 |       172
      2013 |       104         77 |       181
      2014 |       108         81 |       189
      2015 |       107         86 |       193
      2016 |       111         94 |       205
      2017 |       101        116 |       217
      2018 |        71        155 |       226
      2019 |        60        170 |       230
      2020 |        51        184 |       235
      2021 |        27        216 |       243
      2022 |        10        236 |       246
      2023 |         0        247 |       247
-----------+----------------------+----------
     Total |     1,604      2,057 |     3,661
Code:
tabulate year has_RD

           |        has_RD
      year |         0          1 |     Total
-----------+----------------------+----------
      2004 |        76         29 |       105
      2005 |        85         32 |       117
      2006 |        94         34 |       128
      2007 |        98         35 |       133
      2008 |       100         38 |       138
      2009 |        96         45 |       141
      2010 |       105         45 |       150
      2011 |       111         54 |       165
      2012 |       118         54 |       172
      2013 |       125         56 |       181
      2014 |       130         59 |       189
      2015 |       132         61 |       193
      2016 |       141         64 |       205
      2017 |       152         65 |       217
      2018 |       156         70 |       226
      2019 |       160         70 |       230
      2020 |       164         71 |       235
      2021 |       166         77 |       243
      2022 |       168         78 |       246
      2023 |       169         78 |       247
-----------+----------------------+----------
     Total |     2,546      1,115 |     3,661
Here are the summary statistics before and after dropping observations based on the 2 criteria mentioned above:
Code:
summarize TOBIN ROA ESGC ENV SOC GOV lnCO2 EMIS Target Prod RU Policy RD_TR DE BETA SIZE age

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       TOBIN |      3,182    .9839965    .0682519   .2757434   2.460593
         ROA |      1,844    .0270842    .1612215    -4.7646     .99519
        ESGC |      2,057    36.47043    18.90526   .9054831   88.83961
         ENV |      2,057    29.20133    26.18804          0   96.92313
         SOC |      2,057    36.77819    22.12286   .4434122   94.85254
-------------+---------------------------------------------------------
         GOV |      2,057    52.02294    23.23856   .2800454   98.42676
       lnCO2 |        996    14.13439    2.451057   1.699279   18.88316
        EMIS |      2,057    37.09499    31.56243          0   99.68553
      Target |      1,882    20.42819    35.29809          0   95.91837
        Prod |      3,661    .5605026    .4963937          0          1
-------------+---------------------------------------------------------
          RU |      2,057    31.88244    31.62478          0   99.79839
      Policy |      3,661    .7437859    .4366011          0          1
       RD_TR |      1,115    .8232445    12.81121    -.00055   339.7368
          DE |      3,104    51.49517     29.2946          3        100
        BETA |      1,698    1.710485    1.005974  -3.574506   7.031454
-------------+---------------------------------------------------------
        SIZE |      3,497    21.17943    2.214364   6.907755   26.63424
         age |      3,601    17.14524    18.58871          0        141

. drop if has_lnCO2 == 0
(2,665 observations deleted)

. keep if year >=2018
(336 observations deleted)

. summarize TOBIN ROA ESGC ENV SOC GOV lnCO2 EMIS Target Prod RU Policy RD_TR DE BETA SIZE age

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       TOBIN |        655    .9682978    .0600428   .4543625   1.213569
         ROA |        451    .0341375    .0863776     -.5233      .4139
        ESGC |        660     49.2028    16.77136   9.576389   88.83961
         ENV |        660    46.56343    22.21136   .2285714   96.92313
         SOC |        660    49.90482    21.10299   6.186469   94.84564
-------------+---------------------------------------------------------
         GOV |        660    59.54962    22.04239   .2800454    96.5379
       lnCO2 |        660    13.56501    2.529294   1.699279    18.5429
        EMIS |        660    59.99171    24.13653          0    99.0625
      Target |        647    39.07833    39.55785          0   93.89313
        Prod |        660    .3075758    .4618399          0          1
-------------+---------------------------------------------------------
          RU |        660    51.82752    27.55294          0   99.79839
      Policy |        660    .8651515    .3418207          0          1
       RD_TR |        259    .0323892    .1052471   .0000315   1.066443
          DE |        646    53.43344    25.08295          3        100
        BETA |        637    1.884157    .9794677  -.8407099   6.850461
-------------+---------------------------------------------------------
        SIZE |        660    22.51583    1.594792   17.68055   26.63424
         age |        647    20.85781    22.66347          0        141
Code:
xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe

Fixed-effects (within) regression               Number of obs     =        278
Group variable: ID                              Number of groups  =         58

R-squared:                                      Obs per group:
     Within  = 0.6058                                         min =          1
     Between = 0.4386                                         avg =        4.8
     Overall = 0.5556                                         max =          8

                                                F(8,212)          =      40.73
corr(u_i, Xb) = -0.4431                         Prob > F          =     0.0000

------------------------------------------------------------------------------
       TOBIN | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         ENV |   -.000227   .0001361    -1.67   0.097    -.0004954    .0000413
         SOC |   .0001672   .0001384     1.21   0.228    -.0001057    .0004401
         GOV |   .0000361    .000101     0.36   0.722    -.0001631    .0002352
       RD_TR |  -.0295279    .012995    -2.27   0.024    -.0551439   -.0039119
          DE |  -.0012643   .0000784   -16.12   0.000    -.0014189   -.0011097
        BETA |  -.0040603   .0016222    -2.50   0.013    -.0072581   -.0008625
        SIZE |  -.0009862   .0036487    -0.27   0.787    -.0081785    .0062062
         age |   .0004106   .0006199     0.66   0.508    -.0008112    .0016325
       _cons |   1.058147   .0831861    12.72   0.000     .8941694    1.222125
-------------+----------------------------------------------------------------
     sigma_u |  .02791698
     sigma_e |  .01557575
         rho |  .76260957   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(57, 212) = 5.39                     Prob > F = 0.0000

. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (58)  =   1.7e+29
Prob>chi2 =      0.0000

. xtreg TOBIN ENV SOC GOV RD_TR DE BETA SIZE age, fe vce(robust)

Fixed-effects (within) regression               Number of obs     =        278
Group variable: ID                              Number of groups  =         58

R-squared:                                      Obs per group:
     Within  = 0.6058                                         min =          1
     Between = 0.4386                                         avg =        4.8
     Overall = 0.5556                                         max =          8

                                                F(8,57)           =      33.75
corr(u_i, Xb) = -0.4431                         Prob > F          =     0.0000

                                    (Std. err. adjusted for 58 clusters in ID)
------------------------------------------------------------------------------
             |               Robust
       TOBIN | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         ENV |   -.000227   .0001707    -1.33   0.189    -.0005689    .0001148
         SOC |   .0001672   .0001588     1.05   0.297    -.0001507    .0004852
         GOV |   .0000361    .000114     0.32   0.753    -.0001922    .0002644
       RD_TR |  -.0295279   .0253025    -1.17   0.248    -.0801953    .0211395
          DE |  -.0012643   .0001046   -12.09   0.000    -.0014738   -.0010549
        BETA |  -.0040603   .0013741    -2.95   0.005     -.006812   -.0013087
        SIZE |  -.0009862   .0050542    -0.20   0.846     -.011107    .0091347
         age |   .0004106   .0006892     0.60   0.554    -.0009694    .0017907
       _cons |   1.058147    .113507     9.32   0.000     .8308535    1.285441
-------------+----------------------------------------------------------------
     sigma_u |  .02791698
     sigma_e |  .01557575
         rho |  .76260957   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (58)  =   1.7e+29
Prob>chi2 =      0.0000