  • Addressing Heteroskedasticity, Autocorrelation, and Endogeneity in FEM with Micro Panel Data in Stata

    Hello everyone,
    I am currently working with micro panel data and have encountered some issues in my analysis. To select the appropriate model, I performed the Hausman test, which indicated that the Fixed Effects Model (FEM) is the preferred choice. I then tested for heteroskedasticity and autocorrelation in the FEM, and the results confirmed that both are present. In addition, a study I referenced suggests that endogeneity is likely present in the data, but I am unsure how to test for it in this context.

    Given these findings, how should I address heteroskedasticity, autocorrelation, and endogeneity in the FEM? I am using Stata for my analysis. Thank you.
    Last edited by Doan Ngan; 02 Jan 2025, 08:02.

  • #2
    Clustered standard errors will address the heteroskedasticity and autocorrelation. Endogeneity will require some type of IV approach. What sort of variables are the DV and the potentially endogenous variable?
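
    A minimal sketch of that first step, assuming the panel identifiers are named id and year (as the output posted in #5 suggests) and borrowing the variable names from the code in #5. Standard errors clustered on the panel unit are robust to arbitrary heteroskedasticity and to serial correlation within each country.

    ```stata
    * Assumption: id is the country identifier and year the time variable
    xtset id year

    * Fixed-effects (within) estimator with standard errors clustered by country
    xtreg GI FD ICT lGDP lCO2 lHC lTrade lFG NRR, fe vce(cluster id)
    ```

    With roughly 80 countries the number of clusters should be adequate for the usual cluster-robust inference, though that is a rule of thumb rather than a guarantee.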



    • #3
      Originally posted by George Ford
      Clustered standard errors will address the heteroskedasticity and autocorrelation. Endogeneity will require some type of IV approach. What sort of variables are the DV and the potentially endogenous variable?
      Thank you for your response.

      Dependent Variable (DV): Green Investment (GI), proxied by total nuclear, renewables, and other energy production.

      Independent Variables (IVs): Information and Communications Technology (ICT), Financial Development (FD), GDP per capita, CO2 emissions, Human Capital (HC), Trade Openness, Financial Globalization, and Natural Resources Rents (NRR).

      Potential Endogenous Variable: Financial Development (FD) is suspected to be endogenous due to potential reverse causality, as green investment could also influence financial development.

      The context of my study focuses on the impact of ICT and financial development on green investment in highly polluted economies. Data spans from 2000 to 2021, covering 88 countries.

      Could you suggest appropriate instrumental variables or methodologies to test and address the endogeneity of FD in this case?



      • #4
        Everything is continuous, so it's straightforward: use ivreghdfe or ivreg2. You'll need instruments to test exogeneity. I might look for things that affect basic consumer banking, since that's far from Green Investment. It looks like the World Bank has ATMs per 100,000 adults, bank branches, etc.

        Both ivreg2 and ivreghdfe can provide a test of exogeneity (add endog(FD) as an option).
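
        A hedged sketch of what that could look like with ivreghdfe. The instrument names atm and branches are hypothetical placeholders for the World Bank banking-access series, and the other variable names are taken from the code posted in #5. Both ivreghdfe and its dependencies are community-contributed and must be installed first.

        ```stata
        * One-time installation from SSC (ivreghdfe relies on reghdfe, ftools and ivreg2)
        * ssc install ftools
        * ssc install reghdfe
        * ssc install ivreg2
        * ssc install ivreghdfe

        * FD is instrumented by atm and branches (hypothetical variable names).
        * absorb(id) removes country fixed effects, cluster(id) gives
        * cluster-robust standard errors, and endog(FD) requests the
        * exogeneity test for FD.
        ivreghdfe GI ICT lGDP lCO2 lHC lTrade lFG NRR (FD = atm branches), ///
            absorb(id) cluster(id) endog(FD)
        ```

        With two excluded instruments for one endogenous regressor the equation is overidentified, so the output should also include an overidentification test alongside the weak-identification statistics.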

        I'm not convinced it's a problem. Motives for energy investments are unlike normal financial transactions. In many countries the government funds it, and if there was some market problem with getting financing, the government would step in. That is, the dependent variable has a strong policy influence, and nuclear energy is not off-the-shelf.

        Also, CO2 emissions is endogenous in that model, no?



        • #5
          I’m not entirely sure whether variables like CO2 emissions or FD (financial development) are actually endogenous. The paper I referenced (link) uses a 2SLS approach to address potential endogeneity but does not explicitly state which variables are considered endogenous or what instruments were used to resolve the issue.

          That’s why I tried treating each variable as endogenous one by one and used their lags as instruments. Specifically, I ran the model using ivreg2 ..., endog() in Stata. The endogeneity test for log(GDP) (lGDP) was significant only at the 10% level (p = 0.094), which I took as a sign that lGDP is endogenous.
          Code:
           ivreg2 GI FD ICT lCO2 lHC lTrade lFG NRR (lGDP = l.lGDP), endog(lGDP)
          Warning: time variable year has 25 gap(s) in relevant range
          
          IV (2SLS) estimation
          --------------------
          
          Estimates efficient for homoskedasticity only
          Statistics consistent for homoskedasticity only
          
                                                                Number of obs =     1430
                                                                F(  8,  1421) =   107.06
                                                                Prob > F      =   0.0000
          Total (centered) SS     =  4878.936376                Centered R2   =   0.3763
          Total (uncentered) SS   =  5435.226302                Uncentered R2 =   0.4402
          Residual SS             =  3042.769701                Root MSE      =    1.459
          
          ------------------------------------------------------------------------------
                    GI | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                  lGDP |   .3024666   .0823188     3.67   0.000     .1411246    .4638085
                    FD |  -.1702853   .3646959    -0.47   0.641    -.8850762    .5445055
                   ICT |   -.114932   .0707552    -1.62   0.104    -.2536096    .0237456
                  lCO2 |   .6241343   .0394167    15.83   0.000     .5468789    .7013897
                   lHC |  -1.349843   .2074737    -6.51   0.000    -1.756484   -.9432023
                lTrade |  -.8112085   .0910336    -8.91   0.000    -.9896311   -.6327859
                   lFG |   1.634508   .2746078     5.95   0.000     1.096287    2.172729
                   NRR |  -.0135545     .00445    -3.05   0.002    -.0222763   -.0048327
                 _cons |  -5.910915   1.160105    -5.10   0.000     -8.18468   -3.637151
          ------------------------------------------------------------------------------
          Underidentification test (Anderson canon. corr. LM statistic):        1421.020
                                                             Chi-sq(1) P-val =    0.0000
          ------------------------------------------------------------------------------
          Weak identification test (Cragg-Donald Wald F statistic):              2.2e+05
          Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                                   15% maximal IV size              8.96
                                                   20% maximal IV size              6.66
                                                   25% maximal IV size              5.53
          Source: Stock-Yogo (2005).  Reproduced by permission.
          ------------------------------------------------------------------------------
          Sargan statistic (overidentification test of all instruments):           0.000
                                                           (equation exactly identified)
          -endog- option:
          Endogeneity test of endogenous regressors:                               2.804
                                                             Chi-sq(1) P-val =    0.0940
          Regressors tested:    lGDP
          ------------------------------------------------------------------------------
          Instrumented:         lGDP
          Included instruments: FD ICT lCO2 lHC lTrade lFG NRR
          Excluded instruments: L.lGDP
          ------------------------------------------------------------------------------
          However, when I followed a method I found on a forum to test for endogeneity, the results indicated that lCO2 and ICT are endogenous instead.
          Code:
          xtreg ICT FD lGDP lCO2 lHC lTrade lFG NRR L.ICT, fe 
          
          Fixed-effects (within) regression               Number of obs     =      1,426
          Group variable: id                              Number of groups  =         79
          
          R-squared:                                      Obs per group:
               Within  = 0.9854                                         min =          4
               Between = 0.9898                                         avg =       18.1
               Overall = 0.9869                                         max =         21
          
                                                          F(8,1339)         =   11265.06
          corr(u_i, Xb) = 0.2210                          Prob > F          =     0.0000
          
          ------------------------------------------------------------------------------
                   ICT | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                    FD |  -.0365899   .0497154    -0.74   0.462    -.1341186    .0609387
                  lGDP |  -.0194949   .0196783    -0.99   0.322    -.0580985    .0191088
                  lCO2 |   .0414793    .012239     3.39   0.001     .0174696    .0654889
                   lHC |  -.0534944   .0238189    -2.25   0.025    -.1002209    -.006768
                lTrade |   .0170831   .0152012     1.12   0.261    -.0127375    .0469038
                   lFG |   .0504013   .0255411     1.97   0.049     .0002964    .1005063
                   NRR |   .0045693   .0006247     7.31   0.000     .0033439    .0057947
                       |
                   ICT |
                   L1. |   .8904805   .0052027   171.16   0.000      .880274    .9006869
                       |
                 _cons |   .2663344   .2223763     1.20   0.231    -.1699095    .7025782
          -------------+----------------------------------------------------------------
               sigma_u |   .0738328
               sigma_e |  .07114689
                   rho |  .51851976   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(78, 1339) = 2.14                    Prob > F = 0.0000
          
          . predict e1, e
          (334 missing values generated)
          
          . xtreg lCO2 ICT lGDP FD lHC lTrade lFG NRR L.lCO2, fe
          
          Fixed-effects (within) regression               Number of obs     =      1,432
          Group variable: id                              Number of groups  =         79
          
          R-squared:                                      Obs per group:
               Within  = 0.9106                                         min =          4
               Between = 0.9959                                         avg =       18.1
               Overall = 0.9939                                         max =         21
          
                                                          F(8,1345)         =    1713.43
          corr(u_i, Xb) = 0.4964                          Prob > F          =     0.0000
          
          ------------------------------------------------------------------------------
                  lCO2 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   ICT |  -.0013211   .0053219    -0.25   0.804    -.0117611    .0091189
                  lGDP |   .0643995   .0179453     3.59   0.000     .0291957    .0996032
                    FD |  -.0519122   .0459337    -1.13   0.259    -.1420216    .0381972
                   lHC |   .0250903   .0218544     1.15   0.251    -.0177822    .0679627
                lTrade |   .0129446   .0140771     0.92   0.358    -.0146708      .04056
                   lFG |  -.0310768   .0235728    -1.32   0.188    -.0773201    .0151666
                   NRR |   .0012009   .0005762     2.08   0.037     .0000706    .0023313
                       |
                  lCO2 |
                   L1. |   .9462465   .0116915    80.93   0.000      .923311     .969182
                       |
                 _cons |   .0208739   .2054028     0.10   0.919    -.3820708    .4238186
          -------------+----------------------------------------------------------------
               sigma_u |  .10734546
               sigma_e |  .06587119
                   rho |  .72645335   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(78, 1345) = 2.66                    Prob > F = 0.0000
          
          . predict e2, e
          (328 missing values generated)
          
          . xtreg GI lGDP FD lHC lTrade lFG NRR e1 e2, fe cluster(id)
          
          Fixed-effects (within) regression               Number of obs     =      1,425
          Group variable: id                              Number of groups  =         79
          
          R-squared:                                      Obs per group:
               Within  = 0.0601                                         min =          4
               Between = 0.0555                                         avg =       18.0
               Overall = 0.0575                                         max =         21
          
                                                          F(8,78)           =       2.27
          corr(u_i, Xb) = 0.0031                          Prob > F          =     0.0306
          
                                              (Std. err. adjusted for 79 clusters in id)
          ------------------------------------------------------------------------------
                       |               Robust
                    GI | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                  lGDP |   .3980835   .1488902     2.67   0.009     .1016658    .6945011
                    FD |   -.049109   .4678615    -0.10   0.917    -.9805496    .8823317
                   lHC |    .072506   .0818205     0.89   0.378    -.0903862    .2353981
                lTrade |   -.016467   .1121024    -0.15   0.884    -.2396458    .2067118
                   lFG |  -.2653458   .2651809    -1.00   0.320    -.7932804    .2625889
                   NRR |   .0008462   .0025865     0.33   0.744    -.0043031    .0059955
                    e1 |  -.1090508   .0631722    -1.73   0.088     -.234817    .0167153
                    e2 |  -.1818542    .074515    -2.44   0.017    -.3302021   -.0335063
                 _cons |  -2.224472   2.000074    -1.11   0.269    -6.206314     1.75737
          -------------+----------------------------------------------------------------
               sigma_u |  1.6813962
               sigma_e |  .26983173
                   rho |  .97489255   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          I’m not sure which method is correct. Can you help clarify this for me?






          • #6
            The paper uses xtabond2 with lag(lGDP) as a regressor.

            One-year lags on these persistent series do not make good instruments. Use xtabond2 instead.
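
            A rough sketch of an xtabond2 call, under several assumptions that are not from the paper: a dynamic specification with L.GI as a regressor, FD treated as endogenous, all other regressors treated as exogenous, and lags 2 to 4 used as collapsed GMM-style instruments. The community-contributed xtabond2 command must be installed from SSC. This is an illustration of the syntax, not the paper's specification.

            ```stata
            * ssc install xtabond2        // one-time installation
            xtset id year

            * Year dummies created by hand (yr1, yr2, ...) rather than via i.year
            tabulate year, generate(yr)

            * System GMM: lags 2-4 of GI and FD instrument the lagged dependent
            * variable and FD; collapse keeps the instrument count down.
            * Treating the remaining regressors as exogenous is an assumption.
            xtabond2 GI L.GI FD ICT lGDP lCO2 lHC lTrade lFG NRR yr*, ///
                gmmstyle(GI FD, laglimits(2 4) collapse)              ///
                ivstyle(ICT lGDP lCO2 lHC lTrade lFG NRR yr*)         ///
                twostep robust small
            ```

            The Hansen test and the AR(2) test in the output are the usual checks on instrument validity, and the instrument count should stay below the number of countries.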



            • #7
              ivqregress can do the IV quantile part.



              • #8
                Thank you very much for your support and guidance in answering my questions. I truly appreciate your help and the time you've taken to assist me.
