  • Panel data: Pooled v. FE v. RE v. GLS

    I am writing my thesis on the determinants of CO2 emissions: lagged CO2, GDP, energy intensity, and the share of renewable energy in primary energy.

    These are the steps I followed:

    1- Pooled OLS

    Code:
    reg ln_co2pc_gr l.ln_co2pc_gr ln_gdppc_gr ei_ch res_share_ch
    estimates store pooled
      
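          Source |       SS           df       MS      Number of obs   =       987
    -------------+----------------------------------   F(4, 982)       =    271.95
           Model |  5.83456066         4  1.45864016   Prob > F        =    0.0000
        Residual |  5.26702695       982  .005363571   R-squared       =    0.5256
    -------------+----------------------------------   Adj R-squared   =    0.5236
           Total |  11.1015876       986  .011259217   Root MSE        =    .07324

    ------------------------------------------------------------------------------
     ln_co2pc_gr | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     ln_co2pc_gr |
             L1. |  -.1760077   .0223423    -7.88   0.000    -.2198518   -.1321636
                 |
     ln_gdppc_gr |   1.173666   .0611717    19.19   0.000     1.053623    1.293708
           ei_ch |   6.246011   .4014166    15.56   0.000     5.458278    7.033744
    res_share_ch |  -.0167874   .0008761   -19.16   0.000    -.0185066   -.0150682
           _cons |   .0000834   .0024518     0.03   0.973    -.0047279    .0048946
    ------------------------------------------------------------------------------
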
    2- Chow test (the F test that all u_i=0) in the FE estimation - With this test I verified that pooled OLS is preferred to FE

    Code:
    xtreg ln_co2pc_gr l.ln_co2pc_gr ln_gdppc_gr ei_ch res_share_ch,fe
    estimates store fixed
      
    Fixed-effects (within) regression               Number of obs     =        987
    Group variable: pais                            Number of groups  =         21

    R-squared:                                      Obs per group:
         Within  = 0.5277                                         min =         47
         Between = 0.3574                                         avg =       47.0
         Overall = 0.5254                                         max =         47

                                                    F(4,962)          =     268.67
    corr(u_i, Xb) = -0.0369                         Prob > F          =     0.0000

    ------------------------------------------------------------------------------
     ln_co2pc_gr | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     ln_co2pc_gr |
             L1. |   -.182082   .0225374    -8.08   0.000    -.2263102   -.1378538
                 |
     ln_gdppc_gr |   1.202518   .0628348    19.14   0.000     1.079209    1.325827
           ei_ch |   6.177734   .4055494    15.23   0.000     5.381871    6.973597
    res_share_ch |  -.0166086   .0008886   -18.69   0.000    -.0183525   -.0148648
           _cons |  -.0001908    .002465    -0.08   0.938    -.0050283    .0046467
    -------------+----------------------------------------------------------------
         sigma_u |  .00889845
         sigma_e |  .07348315
             rho |  .01445209   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(20, 962) = 0.67                     Prob > F = 0.8574

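
    As a rough check on step 2, the F statistic and rho reported by -xtreg, fe- can be reproduced from numbers already shown in steps 1 and 2. This is a sketch that assumes the FE residual sum of squares can be recovered as sigma_e squared times the residual degrees of freedom (962); the labels RSS_pooled and RSS_FE are introduced here only for illustration.

    ```latex
    \begin{align*}
    RSS_{\text{FE}} &\approx \hat\sigma_e^2 \times 962
        = (0.07348315)^2 \times 962 \approx 5.1946 \\
    F(20,\,962) &= \frac{(RSS_{\text{pooled}} - RSS_{\text{FE}})/20}{RSS_{\text{FE}}/962}
        \approx \frac{(5.2670 - 5.1946)/20}{0.0054} \approx 0.67 \\
    \hat\rho &= \frac{\hat\sigma_u^2}{\hat\sigma_u^2 + \hat\sigma_e^2}
        = \frac{(0.00889845)^2}{(0.00889845)^2 + (0.07348315)^2} \approx 0.0145
    \end{align*}
    ```

    With Prob > F = 0.8574, the 20 country intercepts are jointly insignificant, which is why the pooled specification is not rejected against FE here.
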
    3- Modified Wald test for groupwise heteroskedasticity - The result indicates that I have to reject H0, so I have heteroskedasticity.

    Code:
    xttest3 
    Modified Wald test for groupwise heteroskedasticity
    in fixed effect regression model
    H0: sigma(i)^2 = sigma^2 for all i
    chi2 (21) = 5626.37
    Prob>chi2 = 0.0000
    3' - Following the Stata FAQ "Testing for panel-level heteroskedasticity and autocorrelation" to test for heteroskedasticity - The result indicates that I have to reject H0, so I have heteroskedasticity.

    Code:
    xtgls ln_co2pc_gr l.ln_co2pc_gr ln_gdppc_gr ei_ch res_share_ch, igls panels(heteroskedastic)
    estimates store hetero
      
    Iteration 1: tolerance = .01253158
    Iteration 2: tolerance = .00224603
    Iteration 3: tolerance = .00018464
    Iteration 4: tolerance = .00008188
    Iteration 5: tolerance = .00006792
    Iteration 6: tolerance = .00003587
    Iteration 7: tolerance = .0000168
    Iteration 8: tolerance = 7.513e-06
    Iteration 9: tolerance = 3.295e-06
    Iteration 10: tolerance = 1.432e-06
    Iteration 11: tolerance = 6.200e-07
    Iteration 12: tolerance = 2.679e-07
    Iteration 13: tolerance = 1.157e-07
    Iteration 14: tolerance = 4.992e-08

    Cross-sectional time-series FGLS regression

    Coefficients:  generalized least squares
    Panels:        heteroskedastic
    Correlation:   no autocorrelation

    Estimated covariances      =        21          Number of obs     =        987
    Estimated autocorrelations =         0          Number of groups  =         21
    Estimated coefficients     =         5          Time periods      =         47
                                                    Wald chi2(4)      =    3209.74
    Log likelihood             =  1666.573          Prob > chi2       =     0.0000

    ------------------------------------------------------------------------------
     ln_co2pc_gr | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     ln_co2pc_gr |
             L1. |  -.0365769   .0157667    -2.32   0.020     -.067479   -.0056748
                 |
     ln_gdppc_gr |    .966336   .0285631    33.83   0.000     .9103533    1.022319
           ei_ch |   6.084885   .1788927    34.01   0.000     5.734262    6.435508
    res_share_ch |  -.0150411   .0005018   -29.97   0.000    -.0160246   -.0140576
           _cons |  -.0009792   .0011706    -0.84   0.403    -.0032736    .0013152
    ------------------------------------------------------------------------------

    Code:
    xtgls ln_co2pc_gr l.ln_co2pc_gr ln_gdppc_gr ei_ch res_share_ch, igls

    Iteration 1: tolerance = 0

    Cross-sectional time-series FGLS regression

    Coefficients:  generalized least squares
    Panels:        homoskedastic
    Correlation:   no autocorrelation

    Estimated covariances      =         1          Number of obs     =        987
    Estimated autocorrelations =         0          Number of groups  =         21
    Estimated coefficients     =         5          Time periods      =         47
                                                    Wald chi2(4)      =    1093.35
    Log likelihood             =  1182.094          Prob > chi2       =     0.0000

    ------------------------------------------------------------------------------
     ln_co2pc_gr | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     ln_co2pc_gr |
             L1. |  -.1760077   .0222856    -7.90   0.000    -.2196867   -.1323287
                 |
     ln_gdppc_gr |   1.173666   .0610166    19.24   0.000     1.054075    1.293256
           ei_ch |   6.246011   .4003985    15.60   0.000     5.461244    7.030778
    res_share_ch |  -.0167874   .0008738   -19.21   0.000    -.0185001   -.0150747
           _cons |   .0000834   .0024455     0.03   0.973    -.0047098    .0048765
    ------------------------------------------------------------------------------

    Code:
    local df = e(N_g) - 1
    lrtest hetero . , df(`df')

    Likelihood-ratio test
    Assumption: . nested within hetero

     LR chi2(20) = 968.96
    Prob > chi2 = 0.0000

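
    The likelihood-ratio statistic in step 3' follows directly from the two log likelihoods reported by the -xtgls- runs above. A short worked check, using only numbers from those outputs (the subscripts "hetero" and "homo" are labels chosen here for the two fits):

    ```latex
    \begin{align*}
    LR &= 2\left(\ln L_{\text{hetero}} - \ln L_{\text{homo}}\right)
        = 2\,(1666.573 - 1182.094) = 968.958 \approx 968.96 \\
    \text{df} &= N_g - 1 = 21 - 1 = 20
    \end{align*}
    ```

    The degrees of freedom equal the number of extra variance parameters: 21 panel-specific variances in the heteroskedastic fit against a single common variance in the homoskedastic one.
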
    4 - Breusch-Pagan LM test for cross-sectional correlation in fixed effects model - The result indicates that I can't reject H0, so I don't have cross-sectional correlation.

    Code:
    xttest2
      
    Correlation matrix of residuals:
    __e1 __e4 __e5 __e6 __e7 __e8 __e10 __e11 __e13 __e14 __e15 __e16 __e17 __e18
    __e1 1.0000
    __e4 0.0230 1.0000
    __e5 0.1774 -0.2663 1.0000
    __e6 -0.0815 -0.1596 0.3337 1.0000
    __e7 0.0378 -0.0339 0.0931 -0.4827 1.0000
    __e8 -0.0775 0.0884 -0.1391 0.0130 -0.2160 1.0000
    __e10 0.1728 -0.1214 0.3537 -0.0791 0.0612 -0.0255 1.0000
    __e11 -0.0227 -0.1197 0.2107 0.1892 0.0045 0.1437 0.1316 1.0000
    __e13 -0.0605 -0.2207 -0.0571 0.0261 0.0191 0.0425 0.0010 0.0492 1.0000
    __e14 0.0869 -0.0064 0.0060 -0.1281 -0.0103 0.0488 0.1306 0.0719 -0.0029 1.0000
    __e15 0.1708 -0.1080 0.0993 0.0243 0.0373 -0.2299 0.1401 -0.0315 -0.1551 0.2435 1.0000
    __e16 0.0628 0.0825 0.0666 0.2075 -0.0526 0.1230 -0.0705 0.0390 -0.0794 0.2468 -0.0093 1.0000
    __e17 0.0355 -0.0747 0.2266 -0.0418 -0.0541 -0.2315 0.2137 -0.0571 0.1571 0.0463 -0.1197 -0.0884 1.0000
    __e18 0.1185 0.1001 0.2537 0.1797 -0.1182 0.2911 -0.0325 0.1856 -0.2174 0.1771 0.1690 0.1377 -0.2270 1.0000
    __e19 -0.1699 -0.2237 0.1343 0.0290 0.0353 0.2445 -0.0164 0.3058 0.1293 0.0199 -0.0604 -0.0287 -0.3070 0.0850
    __e20 -0.2560 -0.0242 0.0560 -0.0847 0.3125 -0.0103 -0.0130 0.1775 0.0206 -0.1220 0.0793 0.2789 0.0166 0.0722
    __e21 0.1885 -0.1850 0.2959 0.0675 0.1458 0.0512 0.2397 0.1864 0.3013 0.0005 -0.0682 -0.0539 0.0846 0.2271
    __e22 0.1250 0.0499 -0.1973 -0.0485 0.1092 0.0224 -0.0751 -0.2712 0.0114 0.0947 0.2086 0.0161 -0.0104 0.0202
    __e23 0.0544 0.0267 0.1008 0.0286 -0.1474 0.0368 0.1464 0.2453 0.0411 0.0314 -0.0384 0.0662 0.0609 0.0481
    __e24 -0.0303 0.1201 -0.0874 0.0920 -0.0274 -0.0487 -0.0419 0.0871 0.0408 0.0059 0.0166 0.0027 0.1309 -0.0986
    __e26 0.4876 -0.0495 0.4233 0.3092 0.0435 -0.0122 0.3230 0.3330 -0.0773 -0.0259 0.2067 0.1298 0.1149 0.1722
    __e19 __e20 __e21 __e22 __e23 __e24 __e26
    __e19 1.0000
    __e20 0.0276 1.0000
    __e21 0.2877 0.1445 1.0000
    __e22 0.0194 0.0504 0.1057 1.0000
    __e23 0.2771 0.0291 0.2581 -0.0727 1.0000
    __e24 -0.0341 0.0722 -0.1132 0.1107 -0.0321 1.0000
    __e26 -0.0919 0.0470 0.2584 0.0817 0.1381 -0.0107 1.0000
    Breusch-Pagan LM test of independence: chi2(210) = 224.533, Pr = 0.2340
    Based on 46 complete observations over panel units
    5 - Estimation with RE

    Code:
    xtreg ln_co2pc_gr l.ln_co2pc_gr ln_gdppc_gr ei_ch res_share_ch, re
    estimates store random
      
    Random-effects GLS regression                   Number of obs     =        987
    Group variable: pais                            Number of groups  =         21

    R-squared:                                      Obs per group:
         Within  = 0.5275                                         min =         47
         Between = 0.3787                                         avg =       47.0
         Overall = 0.5256                                         max =         47

                                                    Wald chi2(4)      =    1087.81
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

    ------------------------------------------------------------------------------
     ln_co2pc_gr | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     ln_co2pc_gr |
             L1. |  -.1760077   .0223423    -7.88   0.000    -.2197977   -.1322176
                 |
     ln_gdppc_gr |   1.173666   .0611717    19.19   0.000     1.053771     1.29356
           ei_ch |   6.246011   .4014166    15.56   0.000     5.459249    7.032773
    res_share_ch |  -.0167874   .0008761   -19.16   0.000    -.0185044   -.0150703
           _cons |   .0000834   .0024518     0.03   0.973     -.004722    .0048887
    -------------+----------------------------------------------------------------
         sigma_u |          0
         sigma_e |  .07348315
             rho |          0   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------

    6 - Test of overidentifying restrictions - Why does it fail?

    Code:
    xtoverid
      
    Error - saved RE estimates are degenerate (sigma_u=0) and equivalent to pooled OLS
    r(198);
    7 - Breusch-Pagan LM test for random effects - With this test I verified that pooled OLS is preferred to RE

    Code:
    xttest0
      
    Breusch and Pagan Lagrangian multiplier test for random effects

            ln_co2pc_gr[pais,t] = Xb + u[pais] + e[pais,t]

            Estimated results:
                             |       Var     SD = sqrt(Var)
                    ---------+-----------------------------
                   ln_co2p~r |   .0112592       .1061095
                           e |   .0053998       .0734831
                           u |          0              0

            Test: Var(u) = 0
                                 chibar2(01) =     0.00
                              Prob > chibar2 =   1.0000

    8 - Hausman test - With this test I verified that FE is preferred to RE

    Code:
    hausman fixed random, sigmamore 

                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |     fixed        random       Difference       Std. err.
    -------------+----------------------------------------------------------------
     ln_co2pc_gr |
             L1. |    -.182082    -.1760077       -.0060743        .0023138
                 |
     ln_gdppc_gr |    1.202518     1.173666        .0288517        .0134072
           ei_ch |    6.177734     6.246011       -.0682771        .0472479
    res_share_ch |   -.0166086    -.0167874        .0001788        .0001298
    ------------------------------------------------------------------------------
                b = Consistent under H0 and Ha; obtained from xtreg.
                B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

    Test of H0: Difference in coefficients not systematic

        chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                =  12.28
    Prob > chi2 = 0.0154

    9 - Wooldridge test for autocorrelation in panel data - I can reject H0, so I have first-order autocorrelation

    Code:
    xtserial ln_co2pc_gr ln_co2pc_gr_1 ln_gdppc_gr ei_ch res_share_ch 
    Wooldridge test for autocorrelation in panel data
    H0: no first-order autocorrelation
    F( 1, 20) = 46.802
    Prob > F = 0.0000
    So I decided to do the following:

    10 - The last step was the estimation with xtgls with the options panels(heteroskedastic) and corr(ar1).

    Code:
    xtgls ln_co2pc_gr ln_co2pc_gr_1 ln_gdppc_gr ei_ch res_share_ch, panels(heteroskedastic) corr(ar1)
      
    Cross-sectional time-series FGLS regression

    Coefficients:  generalized least squares
    Panels:        heteroskedastic
    Correlation:   common AR(1) coefficient for all panels  (-0.0149)

    Estimated covariances      =        21          Number of obs     =      1,008
    Estimated autocorrelations =         1          Number of groups  =         21
    Estimated coefficients     =         5          Time periods      =         48
                                                    Wald chi2(4)      =    2909.76
                                                    Prob > chi2       =     0.0000

    -------------------------------------------------------------------------------
      ln_co2pc_gr | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    --------------+----------------------------------------------------------------
    ln_co2pc_gr_1 |  -.0412811   .0166903    -2.47   0.013    -.0739934   -.0085687
      ln_gdppc_gr |   .9839163   .0317807    30.96   0.000     .9216273    1.046205
            ei_ch |   6.192212   .1930138    32.08   0.000     5.813912    6.570512
     res_share_ch |  -.0155283   .0005325   -29.16   0.000    -.0165719   -.0144847
            _cons |  -.0007851   .0012703    -0.62   0.537    -.0032749    .0017046
    -------------------------------------------------------------------------------

    Is it wise to use xtgls, or are there better options?

    Thanks in advance,

    Sebastián.



  • #2
    Sebastián:
    I'm under the impression that yours is not the most fruitful way to tackle this issue.
    That said:
    1) if you actually have panel data (with a continuous regressand), why start from -regress-?
    2) the usual approach is to compare the -fe- and -re- specifications (that is, using -xtreg- if you have an N>T panel dataset);
    3) what does the literature in your research field suggest?
    4) last but not least, as this is not your first message on this forum, please use CODE delimiters to share what you typed and what Stata gave you back (as per the FAQ). Thanks.
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Carlo:

      Thank you for your fast reply.

      1) I started from regress because I would like to consider all the alternatives.

      2) Sorry, I should have explained more. I have 21 countries and annual data for 1971-2019, so N = 21 and T = 48.

      3) You can read: Barrera-Santana, J., Marrero, G. A., Puch, L. A., & Díaz, A. (2021). CO2 emissions and energy technologies in Western Europe. SERIEs, 12(2), 105-150.

      4) Do you mean that only what I type goes between the CODE delimiters, and what Stata gave me back goes outside the CODE delimiters?

      Sorry for my English.

      Regards,

      Sebastián.



        • #5
          Sebastian:
          1) and 2): I do not think starting from (pooled) OLS is the way to go here, as you are dealing with a T>N panel dataset. Pooled OLS is usually the last resort, when the data do not support the evidence of a panel-wise effect.
          3) the paper you quoted focuses on a dynamic panel data model (see -help xtabond-), which is a different (and much more demanding) inferential procedure vs -xtgls- or -xtregar- (that you should consider when dealing with long panel datasets). In addition,
          https://www.stata.com/bookstore/environmental-econometrics-using-stata might be interesting to read.
          4)
          Code:
          I mean that you should include among CODE delimiters what you typed and what Stata gave you back. Just click on the #-shaped toggle available from the tool bar appearing at the top of the post
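
          For point 3) above, a minimal sketch of what trying -xtregar- on this long panel might look like. It assumes the data are already declared with -xtset- (the time variable name year is hypothetical, as it is not shown in the thread) and simply reuses the regressor list from the earlier posts; fe_ar1 is an arbitrary name for the stored results.

          ```stata
          * Sketch only: assumes the panel was declared beforehand, e.g. -xtset pais year-
          * (the variable name year is hypothetical; it does not appear in the thread)

          * Fixed-effects model with AR(1) disturbances; -lbi- additionally reports
          * the Baltagi-Wu LBI and modified Durbin-Watson statistics
          xtregar ln_co2pc_gr l.ln_co2pc_gr ln_gdppc_gr ei_ch res_share_ch, fe lbi

          * Store the results for later comparison (fe_ar1 is an arbitrary name)
          estimates store fe_ar1
          ```

          Whether the lagged dependent variable should stay in a model that already has AR(1) errors is a modelling decision that this sketch does not settle.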
          Many/Most of the listers are not American/British English mother-tongue (I'm clearly a case in point); I think a very useful by-product of participating to this forum is reading and understanding how English mother-tongue listers phrase (and think), which is, in my case, really different from Italian.
          Once apologizing for my far-from-Oxonian English, Nick Cox humourously replied: "Don't worry, I've studied at Cambridge!".
          Last edited by Carlo Lazzaro; 13 Feb 2022, 03:39.
          Kind regards,
          Carlo
          (StataNow 18.5)



          • #6
            Carlo:

            Molto grazie! Thank you so much!

            I like Environmental Econometrics Using Stata very much.

            I read that xtabond can be applied when N>T, but I have T>N, and the fixed effects are lost if I use xtgls.

            My Stata is 17 BE.

            Greetings,

            Sebastián.



            • #7
              Sebastian:
              you may want to take a look at: http://fmwww.bc.edu/EC-C/S2013/823/E...n05.slides.pdf
              Kind regards,
              Carlo
              (StataNow 18.5)



              • #8
                Carlo,

                Thank you again, Carlo!

                I reformulated the problem in a new post at https://www.statalist.org/forums/for...-or-panel-ardl.

                Regards,

                Sebastián.
