  • Performing Auto-correlation Tests on Fixed Effects Model with Unbalanced Panel Data

    Hello all,

    I want to preface this post by saying I am new to Stata and have read other threads on similar issues but am still at a loss. In any case, I apologize in advance for any obvious or poorly phrased questions.

    I am working with a sample of just over 1900 observations of 183 firms, with data ranging from fiscal years 1996-2018. The panel is unbalanced: some firms have data for every fiscal year from 1996 through 2018, while others have only one or a few fiscal years (this was the sample I was provided). The goal is to compare the performance of different types of firms (indicated by a dummy variable).

    I took the first step with the following commands:
    . xtset gvkey fiscalyear, yearly
    . xtdescribe
    . tsfill

    The gvkey is the code used to identify each firm.

    Then I performed the Breusch-Pagan LM test for random effects versus OLS model, rejection of null indicated RE instead of pooled OLS.
    Following this I did the Hausman test for fixed versus random effects model, rejection of the null indicated FE instead of RE.

    Great, so now I choose to use the following regression (for simplification, I cut out the performance measure, the variable of interest, and all the controls):
    . xtreg y x, fe

    Naturally, I want to check the regression for auto-correlation and heteroskedasticity. A test I used earlier after "reg y x", namely "estat imtest, white", indicated heteroskedasticity, and "xtserial y x" indicated serial correlation.

    I used "xttest3" after the regression and found heteroskedasticity. BUT, when I tried to use "xttest2", I got the following response: "Error: too few common observations across panel. no observations". I tried "xtcsd, pesaran" instead, because I thought it would work with an unbalanced panel, but got the response "Error: The panel is highly unbalanced. Not enough common observations across panel to perform Pesaran's test. insufficient observations". So I am not sure how to address the issue of the unbalanced panel in order to test for auto-correlation after using a fixed effects model.

    In any case (assuming there is auto-correlation), I proceeded to use "xtreg y x, fe vce(robust)" and "xtreg y x, fe vce(cluster gvkey)" as a remedy. How can I interpret the results of these regressions to know which option is better? As far as I can tell, the results are the same, and both are indicative of a fe model.

    So in sum:
    1) How can I test for auto-correlation in a fe model when I am running into these issues with an unbalanced panel? I want to be able to justify using the models above.
    2) Which of the two options, vce(robust) or vce(cluster gvkey), yields more robust results (if either)? It seems to depend on the case, so I wanted to ask if you all had any recommendations.

    If there's any other information I can provide, please let me know! A big thank you in advance to anyone who can help!

    Best,

    Evi




  • #2
    Evialina:
    welcome to this forum.
    1) the results from imposing robust or clustered standard errors are, as expected, the same: under -xtreg-, both options give you cluster-robust standard errors (which deal with heteroskedasticity and/or autocorrelation).
    As a consequence, your question #2) becomes immaterial.
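    A minimal sketch of that equivalence, run on simulated data rather than the poster's sample (the variables id, year, x and y and all parameter values are assumptions made for illustration). Both options should report the same standard error for x.

    ```stata
    * Simulated unbalanced panel; id, year, x, y are hypothetical names
    clear
    set seed 2024
    set obs 40
    generate id = _n
    generate u  = rnormal()                  // firm-level effect
    expand 12                                // 12 years per firm
    bysort id: generate year = 1999 + _n
    generate x = rnormal() + 0.5*u
    generate y = 1 + 0.5*x + u + rnormal()*(1 + abs(x))   // heteroskedastic errors
    drop if runiform() < 0.25                // drop rows at random to unbalance the panel
    xtset id year

    quietly xtreg y x, fe vce(robust)
    scalar se_robust = _se[x]

    quietly xtreg y x, fe vce(cluster id)
    scalar se_cluster = _se[x]

    display "robust SE = " se_robust "   cluster SE = " se_cluster
    ```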
    Another issue that crops up on reading your (a bit too long) post is that your time dimension is not negligible (hence serial correlation is to be expected; conversely, you do not seem to be concerned about possible cross-panel correlation). You may therefore want to consider something like -xtregar-, which offers both -fe- and -re- options.
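    If a formal check for first-order serial correlation is still wanted, one possibility that needs only built-in commands and tolerates an unbalanced panel is a Wooldridge-type test on first-differenced residuals, which is the idea behind the community-contributed -xtserial-. What follows is a sketch on simulated data (id, year, x, y and the AR coefficient of 0.6 are assumptions), not a replacement for that command.

    ```stata
    * Simulated unbalanced panel with AR(1) errors; all names are hypothetical
    clear
    set seed 2024
    set obs 60
    generate id = _n
    generate u  = rnormal()                  // firm-level effect
    expand 12
    bysort id: generate year = 1999 + _n
    bysort id (year): generate e = rnormal() if _n == 1
    by id: replace e = 0.6*e[_n-1] + rnormal() if _n > 1   // AR(1) idiosyncratic error
    generate x = rnormal() + 0.5*u
    generate y = 1 + 0.5*x + u + e
    drop if runiform() < 0.15                // create gaps: unbalanced panel
    xtset id year

    * Step 1: first differencing removes the fixed effect;
    * D. returns missing wherever the previous year is absent
    quietly regress D.y D.x, noconstant
    predict double de, residuals

    * Step 2: with no serial correlation in the level errors, the
    * differenced residuals have first-order autocorrelation -0.5
    quietly regress de L.de, noconstant vce(cluster id)
    test L.de = -0.5
    ```

    A small p-value from -test- rejects the null of no first-order serial correlation.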
    As an aside, as recommended by the FAQ, in future posts please share what you typed and what Stata gave you back via CODE delimiters. Statistics being a matter of quantities, numbers and code are worth much more than tons of words. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hello Carlo, thank you for the quick response. Sorry about the long post. As I understood, you would recommend something like

      Code:
       . xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe
      (Tobin's Q is my dep., family_firm is the indep. var. I'm most interested in, followed by controls).

      After applying this, I ran Hausman again (is that okay?) to see whether re or fe fits better. This is the output:

      Code:
      . quietly xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe

      . estimates store fixed

      . quietly xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, re

      . estimates store random

      . hausman fixed random
      
                       ---- Coefficients ----
                   |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                   |     fixed        random       Difference          S.E.
      -------------+----------------------------------------------------------------
       family_firm |    -1.45598    -.0672304       -1.388749        .3016183
      Research_S~s |    .3188283     .6468545       -.3280262        .0535327
         LTDebt_TA |    .7797611     .4687123        .3110488        .3957004
          CAPX_PPE |   -.2780882    -.0035541       -.2745341         .889918
       Debt_MktVal |   -.2039972    -.4104208        .2064236        .0627548
      lntotalass~s |   -.1558742    -.1102616       -.0456126        .2956125
          Firm_Age |    1.017311     -.065807        1.083118        .6340213
      morethan5o~r |    .2875581      .775148       -.4875899        .3239442
      morethan5o~y |    1.097396    -.0732028        1.170599        .3008809
      CEO_Equity~y |   -1.215814    -1.886376         .670562        .8521356
      ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtregar
                B = inconsistent under Ha, efficient under Ho; obtained from xtregar
      
          Test:  Ho:  difference in coefficients not systematic
      
                       chi2(10) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                =     4642.36
                      Prob>chi2 =      0.0000
      So in this case, I would stick with fe. Is there an additional test I should then run for heteroskedasticity/auto-correlation?

      As for cross sectional dependencies across my panel, I tried to apply Pesaran's test, but got the following:

      Code:
      . quietly xtreg Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe

      . xtcsd, pesaran show
       
      Error: The panel is highly unbalanced.
      Not enough common observations across panel to perform Pesaran's test.
      insufficient observations
      Is there another way to test for it? Assuming it exists, I saw one can use Panel Corrected Standard Errors (PCSE) to address it. I had adjusted my performance measure by industry, subtracting from each observation the mean performance of its industry in that fiscal year. I thought that would help with cross-panel effects in my data. If I had to choose between controlling for cross-sectional dependence and controlling for serial correlation, my intuition is that serial correlation is the bigger concern? Or maybe it is better to stick to -xtreg- but to cluster the firms by their industry?
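      For reference, the industry adjustment described above can be sketched as follows on simulated data. Here industry and TobinsQ are hypothetical names for the industry identifier and the unadjusted measure, neither of which is shown in this thread.

      ```stata
      * Simulated firm-year data; industry and TobinsQ are hypothetical names
      clear
      set seed 2024
      set obs 200
      generate gvkey = ceil(_n/10)                      // 20 firms, 10 years each
      bysort gvkey: generate fiscalyear = 1995 + _n
      generate industry = mod(gvkey, 4) + 1             // 4 industries
      generate double TobinsQ = 1 + rnormal()^2         // raw performance measure

      * Mean of the raw measure within each industry-year cell
      bysort industry fiscalyear: egen double ind_year_mean = mean(TobinsQ)

      * Industry-adjusted measure: deviation from the industry-year mean
      generate double Adj_TobinsQ = TobinsQ - ind_year_mean

      * Check: the adjusted measure averages to zero within every cell
      bysort industry fiscalyear: egen double cell_mean = mean(Adj_TobinsQ)
      summarize cell_mean
      ```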

      Thank you again for your help Carlo, I really appreciate it!



      • #4
        Evialina:
        the issue here is that a fairly long time-series dimension allows you to model the autocorrelation (AR(1) with -xtregar-); conversely, -xtregar- does not support heteroskedasticity-robust standard errors. In this respect, -xtreg- sounds better when you impose cluster/robust standard errors.
        As far as correlation across panels is concerned, you can test for it with -xtpcse- (see Example #3, -xtpcse- entry, Stata .pdf manual).
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Thank you again. I noticed a substantial difference in my results between -xtregar- and -xtreg- (with cluster/robust standard errors imposed). Whereas the variable of interest (family_firm) was insignificant with -xtreg-, it became significant at the 0.1% level with -xtregar-.

          Code:
          . xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe
          
          FE (within) regression with AR(1) disturbances  Number of obs     =        284
          Group variable: gvkey                           Number of groups  =         30
          
          R-sq:                                           Obs per group:
               within  = 0.0680                                         min =          1
               between = 0.1362                                         avg =        9.5
               overall = 0.0449                                         max =         20
          
                                                          F(10,244)         =       1.78
          corr(u_i, Xb)  = -0.8203                        Prob > F          =     0.0648
          
          --------------------------------------------------------------------------------------------
                         Adj_TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ---------------------------+----------------------------------------------------------------
                         family_firm |   -1.45598   .4335852    -3.36   0.001    -2.310027   -.6019322
                      Research_Sales |   .3188283   .2208669     1.44   0.150    -.1162208    .7538774
                           LTDebt_TA |   .7797611   .7449268     1.05   0.296    -.6875466    2.247069
                            CAPX_PPE |  -.2780882   1.480276    -0.19   0.851    -3.193839    2.637663
                         Debt_MktVal |  -.2039972   .1849174    -1.10   0.271    -.5682353    .1602408
                       lntotalassets |  -.1558742   .3098104    -0.50   0.615    -.7661183    .4543699
                            Firm_Age |   1.017311   .6477304     1.57   0.118    -.2585458    2.293168
          morethan5ownership_founder |   .2875581   .4884075     0.59   0.557    -.6744748    1.249591
           morethan5ownership_family |   1.097396   .4879125     2.25   0.025     .1363382    2.058454
                      CEO_Equity_Pay |  -1.215814   1.871552    -0.65   0.517    -4.902273    2.470645
                               _cons |  -2.184534   .7270062    -3.00   0.003    -3.616543   -.7525251
          ---------------------------+----------------------------------------------------------------
                              rho_ar |  .66687635
                             sigma_u |  1.6961688
                             sigma_e |  .80046898
                             rho_fov |   .8178516   (fraction of variance because of u_i)
          --------------------------------------------------------------------------------------------
          F test that all u_i=0: F(29,244) = 2.61                      Prob > F = 0.0000
          
          . xtreg Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe vce(cluster gvkey)
          
          Fixed-effects (within) regression               Number of obs     =        316
          Group variable: gvkey                           Number of groups  =         32
          
          R-sq:                                           Obs per group:
               within  = 0.1276                                         min =          1
               between = 0.0076                                         avg =        9.9
               overall = 0.0290                                         max =         21
          
                                                          F(10,31)          =      97.83
          corr(u_i, Xb)  = -0.9304                        Prob > F          =     0.0000
          
                                                         (Std. Err. adjusted for 32 clusters in gvkey)
          --------------------------------------------------------------------------------------------
                                     |               Robust
                         Adj_TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ---------------------------+----------------------------------------------------------------
                         family_firm |    .125311    .768062     0.16   0.871    -1.441162    1.691784
                      Research_Sales |  -.2878188    .087537    -3.29   0.003    -.4663517   -.1092859
                           LTDebt_TA |  -1.373468    .820206    -1.67   0.104    -3.046289    .2993528
                            CAPX_PPE |   1.581573   2.807291     0.56   0.577    -4.143935    7.307082
                         Debt_MktVal |  -.1908685   .2496387    -0.76   0.450    -.7000099     .318273
                       lntotalassets |   .0278072    .437456     0.06   0.950    -.8643901    .9200045
                            Firm_Age |   2.331083   1.130333     2.06   0.048     .0257541    4.636412
          morethan5ownership_founder |   1.372737   .8121856     1.69   0.101    -.2837265      3.0292
           morethan5ownership_family |   .2977501   .9755678     0.31   0.762    -1.691934    2.287434
                      CEO_Equity_Pay |  -3.013912   1.969073    -1.53   0.136    -7.029863     1.00204
                               _cons |  -8.681163   5.939512    -1.46   0.154    -20.79488    3.432552
          ---------------------------+----------------------------------------------------------------
                             sigma_u |  2.8398231
                             sigma_e |   1.045575
                                 rho |  .88062369   (fraction of variance due to u_i)
          --------------------------------------------------------------------------------------------
          I tried to use -xtpcse- but ran into some problems again, so I also tried -xtcd- , which still didn't work.

          Code:
          . xtpcse Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay
          
          Number of gaps in sample:  1
          no time periods are common to all panels, cannot estimate disturbance
          covariance matrix using casewise inclusion
          r(459);
          
          . xtcd  Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay
          too many variables specified
          r(103);
          
          . xtcd Adj_TobinsQ family_firm
          Error: The panel is highly unbalanced.
          Not enough common observations across panel to perform Pesaran's test.
          insufficient observations
          r(2001);
          I'm at a loss for what else I can do concerning correlation across panels. Is there another way? And in any case, even if a test did indicate cross-panel correlation, wouldn't the remedy involve some sort of trade-off against correcting for serial correlation?



          • #6
            Sorry, as a follow up: am I running into these issues also in part because I have a smaller T relative to N? Am I less likely to run into cross-panel issues with a smaller T and larger N?



            • #7
              Evialina:
              as you can see, -xtreg- and -xtregar- use a different number of observations (due to the AR calculation in the latter).
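              The difference in sample size can be reproduced on simulated data: with -fe-, -xtregar- loses the first observation of each panel to the AR(1) transformation, which matches the output in #5 (316 - 32 = 284, with the two firms apparently observed only once dropping out entirely). A minimal sketch, where id, year, x, y and the panel lengths are assumptions:

              ```stata
              * Simulated unbalanced panel without gaps; all names are hypothetical
              clear
              set seed 2024
              set obs 30
              generate id = _n
              generate u  = rnormal()                 // firm-level effect
              generate T  = 3 + mod(id, 8)            // panel lengths from 3 to 10
              expand T
              bysort id: generate year = 1999 + _n
              generate x = rnormal() + 0.5*u
              generate y = 1 + 0.5*x + u + rnormal()
              xtset id year

              quietly xtreg y x, fe
              scalar N_fe = e(N)                      // observations used by -xtreg-
              scalar G    = e(N_g)                    // number of panels

              quietly xtregar y x, fe
              scalar N_ar = e(N)                      // observations used by -xtregar-

              display "xtreg obs = " N_fe "   xtregar obs = " N_ar "   panels = " G
              ```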
              Your panel is probably too unbalanced to check for correlation across panels (and the reasons for the problems you ran into are not those suggested in #6).
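              A quick way to see how little overlap there is, which is what the -xtpcse- and -xtcd- messages in #5 point to, is to count the years in which every panel is observed. A sketch on simulated data with staggered start years (id, year, start and T are assumed names):

              ```stata
              * Simulated panels that start in different years; names are hypothetical
              clear
              set obs 30
              generate id    = _n
              generate start = 1996 + mod(id, 10)     // first year observed
              generate T     = 3 + mod(id, 5)         // 3 to 7 years per panel
              expand T
              bysort id: generate year = start + _n - 1
              xtset id year
              xtdescribe                              // participation patterns

              * Number of panels
              egen byte first_obs = tag(id)
              quietly count if first_obs
              scalar G = r(N)

              * Panels observed in each year (one row per panel-year)
              bysort year: generate n_panels = _N

              * Years in which all G panels are present
              egen byte first_year_obs = tag(year)
              count if first_year_obs & n_panels == G
              display "years common to all panels: " r(N)
              ```

              In this simulated example no year is common to all panels, which is the situation the casewise -xtpcse- message in #5 describes.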
              However, since you're focusing on the -fe- estimator, the goal of your research is investigating variation within the same panel as time goes by: hence, across panel correlation may be a minor issue.
              As an aside, if you're supervised by a professor/teacher/mentor discuss all these issues with her/him, in order to avoid problems as your research progresses.
              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                Hello Carlo, thank you for your help. Your points clarified a lot for me, and I’ll take up some of these concerns with my advisor. I appreciate your time and your contributions to the forum!

                Best wishes and have a wonderful day,

                Evi



                • #9
                  Thanks, you too.
                  Kind regards,
                  Carlo
                  (Stata 19.0)

