Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Multicollinearity Tests - Panel Data

    Hello all,

    I am currently working with a panel dataset on Stata18 and have some questions regarding multicollinearity. After reading various threads I am still slightly confused. I understand many are of the view that multicollinearity is not a huge issue, however, I am still hoping to explore and potentially mitigate the impact on my results (I have obtained very weird/large coefficients and st errors). For context, I am running a fixed effects model exploring the impact of ESG on the weighted average cost of capital (with various control variables).

    'estat vif' does not seem to work on panel models. I tried running a simple 'pwcorr [variable list]'. However, after reading some suggestions, I tried the 'estat vce, corr' instead, which has given me mixed results. I am unsure whether to use robust standard errors in the xtreg before looking at correlations. When including robust standard errors, my correlations are much higher and indicate issues with multiple variables.

    Could someone please help me understand which approach is most appropriate? Here are my differing results:

    pwcorr test:

    Code:
    pwcorr ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT ln_GDPPC
    
    
                 |      ESG   ZSCORE      ASG      CUR      LEV   INTCOV      PTB
    -------------+---------------------------------------------------------------
             ESG |   1.0000
          ZSCORE |  -0.1697   1.0000
             ASG |   0.0097   0.0200   1.0000
             CUR |  -0.1018   0.4814  -0.0129   1.0000
             LEV |  -0.0713  -0.2297   0.0104  -0.3133   1.0000
          INTCOV |  -0.0131   0.0083  -0.0039   0.1752  -0.0908   1.0000
             PTB |  -0.2647   0.5836   0.0371   0.1336   0.0336  -0.0173   1.0000
             ROA |  -0.0258   0.3018   0.0936   0.4127  -0.3549   0.0726   0.2418
              TQ |  -0.1953   0.8527   0.0491   0.2905  -0.1539  -0.0097   0.7732
              BT |  -0.0720  -0.0847  -0.0459   0.0405   0.1231   0.0258   0.1270
           SIGMA |   0.1472   0.1000  -0.0148  -0.0030  -0.1160  -0.0442   0.1231
          LEGALV |  -0.3317  -0.0707   0.0071   0.1256   0.0534   0.0228  -0.1322
            INFL |   0.1222  -0.0579   0.0401  -0.0791   0.0117  -0.0257  -0.0463
           COVID |   0.0563  -0.0326   0.0156   0.0251  -0.0305   0.0445  -0.0435
         ln_SIZE |   0.5180  -0.0835   0.0220  -0.2159   0.0886  -0.0576  -0.0273
          ln_SRT |   0.4794  -0.1011   0.0193  -0.2684   0.0235  -0.0583  -0.0172
        ln_GDPPC |  -0.0885  -0.1063   0.0161   0.1304  -0.0574   0.0840  -0.1558
    
                 |      ROA       TQ       BT    SIGMA   LEGALV     INFL    COVID
    -------------+---------------------------------------------------------------
             ROA |   1.0000
              TQ |   0.3384   1.0000
              BT |  -0.1035  -0.1214   1.0000
           SIGMA |   0.0106   0.0847   0.1636   1.0000
          LEGALV |   0.0020  -0.0694   0.0161  -0.5173   1.0000
            INFL |   0.0791  -0.0463  -0.0466   0.1869   0.0038   1.0000
           COVID |   0.0339  -0.0328   0.0081  -0.0598   0.1672   0.2005   1.0000
         ln_SIZE |  -0.1353  -0.0866  -0.0251   0.2992  -0.7132   0.0543  -0.1201
          ln_SRT |  -0.1195  -0.1021  -0.0335   0.3013  -0.7447   0.0673  -0.1189
        ln_GDPPC |   0.1024  -0.1197  -0.0105  -0.3868   0.6927   0.3854   0.3054
    
                 |  ln_SIZE   ln_SRT ln_GDPPC
    -------------+---------------------------
         ln_SIZE |   1.0000
          ln_SRT |   0.9701   1.0000
        ln_GDPPC |  -0.4701  -0.4727   1.0000
    estat vce, corr (Without robust standard errors) :

    Code:
    xtreg WACC ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT ln_GDPPC,fe
    
    Fixed-effects (within) regression               Number of obs     =      1,019 Group variable: ID                              Number of groups  =         51
      
      R-squared:                                      Obs per group:
           Within  = 0.2454                                         min =         19
           Between = 0.1379                                         avg =       20.0
           Overall = 0.1392                                         max =         20
      
                                                      F(17, 951)        =      18.19
      corr(u_i, Xb) = -0.6622                         Prob > F          =     0.0000
      
      ------------------------------------------------------------------------------
              WACC | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               ESG |   .1266163   .0328428     3.86   0.000     .0621637     .191069
            ZSCORE |   .0468535    .041237     1.14   0.256    -.0340725    .1277794
               ASG |    -.00337   .0029078    -1.16   0.247    -.0090764    .0023365
               CUR |  -.9584892   .2990543    -3.21   0.001    -1.545372   -.3716065
               LEV |  -.2248722   .0287947    -7.81   0.000    -.2813807   -.1683637
            INTCOV |   .0000104   4.46e-06     2.34   0.020     1.68e-06    .0000192
               PTB |   .2069231   .1579241     1.31   0.190     -.102997    .5168431
               ROA |  -.0347309    .030474    -1.14   0.255     -.094535    .0250732
                TQ |   -.145192   .2926098    -0.50   0.620    -.7194275    .4290434
                BT |   .8209768    .331237     2.48   0.013      .170937    1.471017
             SIGMA |    13.7027   1.630689     8.40   0.000     10.50254    16.90287
            LEGALV |  -10.79952   3.204504    -3.37   0.001    -17.08824   -4.510804
              INFL |  -.0445178    .027403    -1.62   0.105    -.0982951    .0092596
             COVID |   -.000032    .000038    -0.84   0.400    -.0001067    .0000426
           ln_SIZE |   2.708554     1.0963     2.47   0.014     .5571075    4.860001
            ln_SRT |  -2.930164   .6583161    -4.45   0.000    -4.222084   -1.638244
          ln_GDPPC |   5.581056   2.122431     2.63   0.009     1.415865    9.746246
             _cons |  -32.63803    16.1497    -2.02   0.044    -64.33119   -.9448737
      -------------+----------------------------------------------------------------
           sigma_u |  4.5866087
           sigma_e |  4.1834419
               rho |  .54587392   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0: F(50, 951) = 7.78                     Prob > F = 0.0000
    Code:
    estat vce,corr
      Correlation matrix of coefficients of xtreg model
      
              e(V) |      ESG    ZSCORE       ASG       CUR       LEV    INTCOV       PTB       ROA        TQ        BT     SIGMA    LEGALV      INFL     COVID   ln_SIZE    ln_SRT  ln_GDPPC     _cons
      -------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
               ESG |   1.0000                                                                                                                                                                          
            ZSCORE |   0.0153    1.0000                                                                                                                                                                
               ASG |   0.0368   -0.0222    1.0000                                                                                                                                                      
               CUR |  -0.0303   -0.6145   -0.0054    1.0000                                                                                                                                            
               LEV |   0.0073    0.0909   -0.0281   -0.1710    1.0000                                                                                                                                  
            INTCOV |  -0.0260   -0.0679    0.0007    0.0988   -0.0126    1.0000                                                                                                                        
               PTB |  -0.0901    0.1738   -0.1125    0.1115   -0.2725   -0.0090    1.0000                                                                                                              
               ROA |   0.0448    0.0579   -0.0288   -0.1046    0.1375    0.0130   -0.2367    1.0000                                                                                                    
                TQ |   0.0981   -0.5633    0.0690    0.2545    0.1675    0.0415   -0.7312    0.1796    1.0000                                                                                          
                BT |  -0.0893   -0.0315    0.0206   -0.0663    0.0101    0.0107   -0.3408    0.1105    0.2357    1.0000                                                                                
             SIGMA |  -0.0251    0.0712    0.0314   -0.1232    0.0942    0.0498   -0.0799    0.0999    0.0483   -0.1128    1.0000                                                                      
            LEGALV |   0.0132   -0.0158    0.0199    0.0039   -0.0335    0.0522   -0.0213    0.0838   -0.0031    0.0165    0.1549    1.0000                                                            
              INFL |  -0.1078   -0.0585    0.0210   -0.0209    0.0569    0.0023   -0.1396   -0.0240    0.1069    0.0891   -0.2451    0.1852    1.0000                                                  
             COVID |  -0.1718   -0.0387    0.0022    0.0207    0.0109   -0.0194   -0.0793    0.0180    0.0526    0.0142   -0.0283   -0.0477    0.0303    1.0000                                        
           ln_SIZE |  -0.2173    0.1367   -0.1293    0.0919   -0.1666   -0.0042    0.6837   -0.1119   -0.4886   -0.0669   -0.1631   -0.0731   -0.1672   -0.0791    1.0000                              
            ln_SRT |   0.0400    0.0528   -0.0465   -0.0068    0.1136    0.0032   -0.0068   -0.2084   -0.0175    0.0147    0.1492    0.1539   -0.0217   -0.0014   -0.4483    1.0000                    
          ln_GDPPC |  -0.0808   -0.0383    0.0236    0.0481   -0.0039    0.0017   -0.1040    0.0639    0.0953   -0.0085    0.1299   -0.1527   -0.7622   -0.0764   -0.0045   -0.1644    1.0000          
             _cons |   0.1439   -0.0210    0.0476   -0.1186    0.0305   -0.0120   -0.2138    0.0201    0.1181    0.0154   -0.0973   -0.0265    0.7323    0.1398   -0.3645    0.1324   -0.8841    1.0000

    estat vce, corr (with robust):

    Code:
    xtreg WACC ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT ln_GDPPC,fe vce(cluster ID)
    
    
    Fixed-effects (within) regression               Number of obs     =      1,019
    Group variable: ID                              Number of groups  =         51
    
    R-squared:                                      Obs per group:
         Within  = 0.2454                                         min =         19
         Between = 0.1379                                         avg =       20.0
         Overall = 0.1392                                         max =         20
    
                                                    F(17, 50)         =    1021.32
    corr(u_i, Xb) = -0.6622                         Prob > F          =     0.0000
    
                                        (Std. err. adjusted for 51 clusters in ID)
    ------------------------------------------------------------------------------
                 |               Robust
            WACC | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             ESG |   .1266163   .0478849     2.64   0.011     .0304367     .222796
          ZSCORE |   .0468535   .0729644     0.64   0.524    -.0996998    .1934067
             ASG |    -.00337   .0019584    -1.72   0.091    -.0073035    .0005635
             CUR |  -.9584892   .9860384    -0.97   0.336    -2.939006    1.022027
             LEV |  -.2248722    .118422    -1.90   0.063    -.4627298    .0129854
          INTCOV |   .0000104   7.78e-07    13.42   0.000     8.88e-06     .000012
             PTB |   .2069231   .1354552     1.53   0.133    -.0651467    .4789928
             ROA |  -.0347309   .0377041    -0.92   0.361    -.1104617        .041
              TQ |   -.145192    .353533    -0.41   0.683     -.855284    .5648999
              BT |   .8209768   .4355486     1.88   0.065    -.0538484    1.695802
           SIGMA |    13.7027   11.22082     1.22   0.228    -8.834986    36.24039
          LEGALV |  -10.79952   4.890098    -2.21   0.032    -20.62157   -.9774691
            INFL |  -.0445178   .0362501    -1.23   0.225    -.1173283    .0282928
           COVID |   -.000032   .0000444    -0.72   0.474    -.0001212    .0000571
         ln_SIZE |   2.708554   1.455806     1.86   0.069    -.2155177    5.632627
          ln_SRT |  -2.930164   2.054601    -1.43   0.160    -7.056952    1.196624
        ln_GDPPC |   5.581056   2.906683     1.92   0.061    -.2571898     11.4193
           _cons |  -32.63803   16.12961    -2.02   0.048     -65.0353   -.2407657
    -------------+----------------------------------------------------------------
         sigma_u |  4.5866087
         sigma_e |  4.1834419
             rho |  .54587392   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    Code:
    estat vce,corr
    
    Correlation matrix of coefficients of xtreg model
    
            e(V) |      ESG    ZSCORE       ASG       CUR       LEV    INTCOV       PTB       ROA        TQ        BT     SIGMA    LEGALV      INFL     COVID   ln_SIZE    ln_SRT  ln_GDPPC     _cons
    -------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
             ESG |   1.0000                                                                                                                                                                          
          ZSCORE |   0.2785    1.0000                                                                                                                                                                
             ASG |   0.2475    0.3166    1.0000                                                                                                                                                      
             CUR |  -0.3215   -0.9772   -0.3006    1.0000                                                                                                                                            
             LEV |  -0.4208   -0.8343   -0.2069    0.8798    1.0000                                                                                                                                  
          INTCOV |   0.0219   -0.0329   -0.1878   -0.0066   -0.2599    1.0000                                                                                                                        
             PTB |  -0.1889    0.1814    0.1335   -0.1129   -0.2361   -0.0513    1.0000                                                                                                              
             ROA |  -0.0417   -0.4253   -0.2902    0.5078    0.5498   -0.1301   -0.3822    1.0000                                                                                                    
              TQ |  -0.0772   -0.9000   -0.3320    0.8222    0.7310    0.0386   -0.4771    0.4502    1.0000                                                                                          
              BT |  -0.2923    0.1306   -0.0610   -0.1276   -0.1681    0.1829   -0.0683   -0.1792   -0.1104    1.0000                                                                                
           SIGMA |   0.3788    0.8802    0.2269   -0.9170   -0.9248    0.3610    0.1475   -0.5347   -0.7367    0.0814    1.0000                                                                      
          LEGALV |   0.1032   -0.2246   -0.2094    0.2271    0.2420    0.2724   -0.3723    0.2980    0.2843    0.1084   -0.2146    1.0000                                                            
            INFL |  -0.3218   -0.5684   -0.3296    0.5675    0.6453   -0.2035   -0.2911    0.5176    0.6128    0.0214   -0.6540    0.4680    1.0000                                                  
           COVID |  -0.3726   -0.7362   -0.3834    0.7686    0.8126   -0.2461   -0.2784    0.5972    0.6926   -0.0350   -0.8186    0.2869    0.7201    1.0000                                        
         ln_SIZE |  -0.1140    0.5756    0.1838   -0.5634   -0.5899    0.0483    0.7590   -0.5698   -0.6927    0.2380    0.5636   -0.4586   -0.4662   -0.5786    1.0000                              
          ln_SRT |  -0.4170   -0.8333   -0.2345    0.9028    0.8979   -0.1387   -0.1567    0.5095    0.6689   -0.0194   -0.9101    0.3034    0.5537    0.7789   -0.6391    1.0000                    
        ln_GDPPC |   0.4301    0.7509    0.2183   -0.8070   -0.8097    0.2233   -0.0020   -0.5016   -0.6140    0.0020    0.8468   -0.2595   -0.7824   -0.7546    0.4143   -0.8336    1.0000          
           _cons |  -0.2556   -0.6716   -0.1489    0.6927    0.7098   -0.3187   -0.2356    0.5420    0.6393   -0.1668   -0.7551    0.1001    0.7255    0.6905   -0.5823    0.7094   -0.8918    1.0000
    Last edited by Ethan Kilmer; 21 Jun 2024, 12:06.

  • #2
    year fixed effect?

    Comment


    • #3
      you can convert your xtreg to just reg to use estat vif.

      reg y x , absorb(id)
      estat vif

      Comment


      • #4
        I was unsure whether the year fixed effect was relevant for the multicollinearity test.

        If running the 'estat vce, corr', should I include the robust standard error in the xtreg?

        Thanks!

        Comment


        • #5
          Not sure, but my intuition tells me to use the cluster robust errors, because you will in the final model. I'd put year in there, and switch to reghdfe.

          Code:
          ssc install reghdfe
          reghdfe WACC ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT ln_GDPPC, absorb(ID year) cluster(ID)
          But run this to see what you get. Keep in mind, this will not detect multicollinearity of linear combinations.
          Code:
           qui reg WACC ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT ln_GDPPC i.id i.year
          estat vif
          Also do this to exclude each variable to see how the results change. Might help you track down a problem.

          Code:
          global XMAIN ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT
          local i = 1
          foreach var in ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT {
              global XEX "`var'"
              global X : list global(XMAIN) - global(XEX)
              eststo e`i': reghdfe WACC $X , absorb(ID year) cluster(ID)
              local i = `i'  +1
          }
          esttab e1 e2 e3 e4 e5 e6 e7 e8
          esttab e9 e10 e11 e12 e13 e14 e16 e16

          Comment


          • #6
            If you have a few main variables of interest, you can also include those in the regression and then add the other variables to determine when the problem starts.

            Comment


            • #7
              The reg option will not work. It makes a mess of it. Ignore that.

              Can get VIFs this way.

              Code:
              global ALL ESG ZSCORE ASG CUR LEV INTCOV PTB ROA TQ BT SIGMA LEGALV INFL COVID ln_SIZE ln_SRT
              foreach var in $ALL {
                  quietly {
                  global Y "`var'"
                  global X : list global(ALL) - global(Y)
                  reghdfe $Y $X , absorb(ID year) cluster(ID)
                  }
                  di "VIF_`var'" _col(20) %5.3f 1/(1-e(r2_within))
              }

              Comment


              • #8
                Okay I will try this out! Is there any particular reason to use reghdfe rather than xtreg y x, fe?

                I really appreciate your input

                Comment


                • #9
                  It absorbs multiple fixed effects, providing a cleaner output. (Though the latest version of Stata can do so as well, but I'm still trying to figure out why the SE are so different).

                  Comment


                  • #10
                    estat vce, corr gives you the correlations among the parameter estimates, not the x. That’s very different. I can’t tell if the coefficients are large because you haven’t shown us units of your Y variable. Coefficients on logged vars often appear “large” because often it only makes sense to change the log by, say, 0.01 (a one percent increase in X).

                    Comment


                    • #11
                      Thank you for the clarification.

                      My dependent variable (ESG) is on a scale of 0-100.

                      "Coefficients on logged vars often appear “large” because often it only makes sense to change the log by, say, 0.01 (a one percent increase in X)" - so it makes sense to take the log by 0.01 rather than just "ln(variable)"? Could you provide me with the code for this?

                      Comment


                      • #12
                        I'd start with TQ/ZSCORE and ln_SIZE/ln_SRT. Both have high correlations (the latter 0.97).

                        Comment


                        • #13
                          Yes, I am currently investigating the correlations and impacts of including these variables one by one.

                          In regards to my remaining data, does it make sense to take the regular natural logarithm of variables with large positive values (ln(variable))? For example, total assets of a firm, or GDP per capita? I am just slightly confused by Jeff's comment on 0.01.

                          Again, thanks for the input!!
                          Last edited by Ethan Kilmer; 22 Jun 2024, 09:25.

                          Comment


                          • #14
                            George, I have some more questions regarding reghdfe. FYI, inn comparison to xtreg, it seems to provide me with more significant coefficients which I am happy with (thanks for the suggestion!!!).

                            1. Can I use interaction terms within this method?
                            2. Is it still considered FE regression? I am wondering how to explain this in my methodology.
                            3. Is interpretation of results the same?
                            4. for the 'absorb (year) portion, should I instead do 'absorb (Q_DATE) - my variables are measured quarterly.

                            Here is my output:

                            Code:
                             
                            reghdfe WACC i.LEGAL#c.ESG TQ ASG CUR LEV ROA BT INFL ln_SIZE, absorb(Q_DATE) cluster(ID)
                            
                            (MWFE estimator converged in 1 iterations)
                            
                            HDFE Linear regression                            Number of obs   =      1,019
                            Absorbing 1 HDFE group                            F(  10,     50) =       7.31
                            Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                              R-squared       =     0.2295
                                                                              Adj R-squared   =     0.2069
                                                                              Within R-sq.    =     0.1547
                            Number of clusters (ID)      =         51         Root MSE        =     5.0744
                            
                                                                (Std. err. adjusted for 51 clusters in ID)
                            ------------------------------------------------------------------------------
                                         |               Robust
                                    WACC | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                            -------------+----------------------------------------------------------------
                             LEGAL#c.ESG |
                                      0  |   .0885032   .0465419     1.90   0.063    -.0049789    .1819853
                                      1  |   .0578929   .0377316     1.53   0.131    -.0178934    .1336791
                                         |
                                      TQ |   .2733341   .1296202     2.11   0.040     .0129842    .5336841
                                     ASG |  -.0006754   .0027836    -0.24   0.809    -.0062665    .0049157
                                     CUR |   .5938003   .2274227     2.61   0.012     .1370084    1.050592
                                     LEV |  -.0764465    .037516    -2.04   0.047    -.1517995   -.0010934
                                     ROA |    .039027   .0642866     0.61   0.547    -.0900964    .1681503
                                      BT |   1.418311    .508703     2.79   0.007     .3965507    2.440071
                                    INFL |   -.068703   .0985207    -0.70   0.489    -.2665876    .1291816
                                 ln_SIZE |  -.0954152   .2038758    -0.47   0.642    -.5049117    .3140813
                                   _cons |   12.35335   11.99887     1.03   0.308    -11.74709    36.45379
                            ------------------------------------------------------------------------------
                            
                            Absorbed degrees of freedom:
                            -----------------------------------------------------+
                             Absorbed FE | Categories  - Redundant  = Num. Coefs |
                            -------------+---------------------------------------|
                                  Q_DATE |        20           0          20     |
                            -----------------------------------------------------+
                            
                            . margins LEGAL,dydx(ESG)
                            
                            Average marginal effects                                 Number of obs = 1,019
                            Model VCE: Robust
                            
                            Expression: Linear prediction, predict()
                            dy/dx wrt:  ESG
                            
                            ------------------------------------------------------------------------------
                                         |            Delta-method
                                         |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
                            -------------+----------------------------------------------------------------
                            ESG          |
                                   LEGAL |
                                      0  |   .0885032   .0465419     1.90   0.057    -.0027172    .1797236
                                      1  |   .0578929   .0377316     1.53   0.125    -.0160598    .1318455
                            ------------------------------------------------------------------------------

                            Comment


                            • #15
                              Ethan: But your use of reghdfe means you didn't include unit fixed effects. You only included time effects. You need to absorb(ID Q_DATE). Personally, in this sort of application I see no reason to use reghdfre. You can use xtreg with i.Q_Date with the fe option:

                              Code:
                               
                               xtreg WACC i.LEGAL#c.ESG TQ ASG CUR LEV ROA BT INFL ln_SIZE i.Q_DATE, fe vce(cluster ID)
                              This should reproduce reghdfe.

                              There's not code needed for what I was talking about. The point is that you are not going to increase ln(X) by one, as that means well over a 100% increase in X. You want to change ln(X) by a smaller amount when interpreting the coefficient.

                              Is there a reason you don't have LEGAL and ESG appearing by themselves? Unless they don't ever change across time, it's unusual to have interactions but not levels.

                              Comment

                              Working...
                              X