Performing Auto-correlation Tests on Fixed Effects Model with Unbalanced Panel Data

Evialina Yakimovich

Join Date: Nov 2019

Posts: 5
#1

05 Nov 2019, 06:42

Hello all,

I want to preface this post by saying that I am new to Stata. I have read other threads on similar issues but am still at a loss, so I apologize in advance for any obvious or poorly phrased questions.

I am working with a sample of just over 1900 observations on 183 firms, with data ranging over fiscal years 1996-2018. The panel is unbalanced: some firms have data for every fiscal year from 1996 through 2018, while others have data for only one or a few fiscal years (this was the sample I was provided). The goal is to compare the performance of different types of firms (indicated by a dummy variable).

I took the first step with the following commands:
. xtset gvkey fiscalyear, yearly
. xtdescribe
. tsfill

The gvkey is the code used to identify each firm.

Then I performed the Breusch-Pagan LM test for random effects versus pooled OLS; rejection of the null indicated RE rather than pooled OLS.
Following this, I ran the Hausman test for fixed versus random effects; rejection of the null indicated FE rather than RE.

Great, so now I choose to use the following regression (for simplicity, I have replaced the performance measure with y, and the variable of interest and all the controls with x):
. xtreg y x, fe

Naturally, I want to check the regression for auto-correlation and heteroskedasticity. A previous test I ran after "reg y x", namely "estat imtest, white", indicated heteroskedasticity, and "xtserial y x" indicated serial correlation.

I used "xttest3" after the regression and found heteroskedasticity. But when I tried "xttest2", I got the response "Error: too few common observations across panel. no observations". I then tried "xtcsd, pesaran", because I thought it would work with an unbalanced panel, but got the response "Error: The panel is highly unbalanced. Not enough common observations across panel to perform Pesaran's test. insufficient observations". So I am not sure how to handle the unbalanced panel in order to test for auto-correlation after a fixed effects model.

In any case, assuming there is auto-correlation, I proceeded to use "xtreg y x, fe vce(robust)" and "xtreg y x, fe vce(cluster gvkey)" as a remedy. How can I interpret the results of these regressions to know which option is better? As far as I can tell, the results are the same, and both are indicative of a fe model.

So in sum:
1) How can I test for auto-correlation in a fe model when I am running into these issues with an unbalanced panel? I want to be able to justify using the models above.
2) Which of the two options, vce(robust) or vce(cluster gvkey), yields more robust results (if either)? It seems to depend on the case, so I wanted to ask whether you had any recommendations.

If there's any other information I can provide, please let me know! A big thank you in advance to anyone who can help!

Best,

Evi

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

05 Nov 2019, 07:28

Evialina:
welcome to this forum.
1) the results from robust and clustered standard errors are, as expected, the same: under -xtreg-, both options give you cluster-robust standard errors (which accommodate heteroskedasticity and/or autocorrelation).
As a consequence, your question #2) becomes immaterial.
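The equivalence in point 1) can be checked directly. What follows is only a minimal sketch: it assumes the Grunfeld example panel (fetched with -webuse-, so internet access is needed) as a stand-in for the original data, and the variables invest, mvalue, kstock and company belong to that dataset, not to this thread.

Code:

* Stand-in data (an assumption): Grunfeld investment panel, not the poster's sample
webuse grunfeld, clear
xtset company year

* Fixed effects with "robust" standard errors
xtreg invest mvalue kstock, fe vce(robust)
matrix V_robust = e(V)

* Fixed effects with standard errors clustered on the panel identifier
xtreg invest mvalue kstock, fe vce(cluster company)
matrix V_cluster = e(V)

* Largest relative difference between the two variance matrices
display mreldif(V_robust, V_cluster)

If the two options really are the same estimator of the variance, the displayed difference should be zero up to rounding.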
Another issue that crops up on reading your (a bit too long) post is that your time dimension is not negligible, hence serial correlation is to be expected (conversely, you do not seem to be concerned about possible across-panel correlation). You may therefore want to consider something like -xtregar-, which offers both -fe- and -re- options.
As an aside, as recommended by the FAQ, in your future posts please share what you typed and what Stata gave you back via CODE delimiters. Statistics being a matter of quantities, numbers and code are worth much more than tons of words. Thanks.

Kind regards,
Carlo
(Stata 19.0)

Evialina Yakimovich

Join Date: Nov 2019
Posts: 5

05 Nov 2019, 10:38

Hello Carlo, thank you for the quick response. Sorry about the long post. As I understood, you would recommend something like

Code:

 . xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe

(Tobin's Q is my dep., family_firm is the indep. var. I'm most interested in, followed by controls).

After applying this, I used Hausman again (is that okay?) to see whether re or fe fits better. This is the output:

Code:

. . quietly xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equi
> ty_Pay, fe

. 
. . estimates store fixed 

. 
. . quietly xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equi
> ty_Pay, re

. 
. . estimates store random 

. 
. . hausman fixed random

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
 family_firm |    -1.45598    -.0672304       -1.388749        .3016183
Research_S~s |    .3188283     .6468545       -.3280262        .0535327
   LTDebt_TA |    .7797611     .4687123        .3110488        .3957004
    CAPX_PPE |   -.2780882    -.0035541       -.2745341         .889918
 Debt_MktVal |   -.2039972    -.4104208        .2064236        .0627548
lntotalass~s |   -.1558742    -.1102616       -.0456126        .2956125
    Firm_Age |    1.017311     -.065807        1.083118        .6340213
morethan5o~r |    .2875581      .775148       -.4875899        .3239442
morethan5o~y |    1.097396    -.0732028        1.170599        .3008809
CEO_Equity~y |   -1.215814    -1.886376         .670562        .8521356
------------------------------------------------------------------------------
                         b = consistent under Ho and Ha; obtained from xtregar
          B = inconsistent under Ha, efficient under Ho; obtained from xtregar

    Test:  Ho:  difference in coefficients not systematic

                 chi2(10) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =     4642.36
                Prob>chi2 =      0.0000

So in this case, I would stick with fe. Is there then an additional test I should run for heteroskedasticity/auto-correlation?

As for cross sectional dependencies across my panel, I tried to apply Pesaran's test, but got the following:

Code:

. . quietly xtreg Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe

. 
. . xtcsd, pesaran show
 
Error: The panel is highly unbalanced.
Not enough common observations across panel to perform Pesaran's test.
insufficient observations

Is there another way to test for it? Assuming it exists, I saw that one can use Panel Corrected Standard Errors (PCSE) to address it. I had adjusted my performance measure for industry by subtracting, from each observation, the mean performance of its industry in that fiscal year. I thought that would help with cross-panel effects in my data. If I had to choose between controlling for cross-sectional dependence and controlling for serial correlation, my intuition is that serial correlation is the bigger concern? Or maybe it is better to stick to -xtreg- but cluster the firms by their industry?

Thank you again for your help Carlo, I really appreciate it!

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#4

05 Nov 2019, 11:22

Evialina:
the issue here is that a pretty long time-series dimension allows you to model the autocorrelation (AR1 with -xtregar-); conversely, -xtregar- does not support heteroskedasticity-robust standard errors. In this respect, -xtreg- sounds better when you impose cluster/robust standard errors.
As far as correlation across panels is concerned, you can test for it with -xtpcse- (see Example #3, -xtpcse- entry, Stata .pdf manual).
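A minimal sketch of -xtpcse- on an unbalanced panel follows. It assumes the Grunfeld example panel (fetched with -webuse-) as stand-in data, made unbalanced on purpose; none of these variable names come from this thread. By default -xtpcse- estimates the disturbance covariance matrix casewise, that is, from the periods common to all panels, whereas the -pairwise- option uses, for each pair of panels, the periods that the two have in common. Whether -pairwise- is enough for a very unbalanced sample is not something this sketch can establish.

Code:

* Stand-in data (an assumption), made unbalanced by dropping five early years for one company
webuse grunfeld, clear
drop if company == 1 & year <= 1939
xtset company year

* Default: casewise inclusion (only periods observed for every panel)
xtpcse invest mvalue kstock

* Pairwise inclusion (all periods shared by each pair of panels)
xtpcse invest mvalue kstock, pairwise
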

Kind regards,
Carlo
(Stata 19.0)

Evialina Yakimovich

Join Date: Nov 2019
Posts: 5

05 Nov 2019, 12:52

Thank you again. I noticed a substantial difference in my results between -xtregar- and -xtreg- (with cluster/robust standard errors imposed). Whereas the variable of interest (family_firm) was insignificant with -xtreg-, it became significant at the .1% level with -xtregar-.

Code:

. xtregar Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe

FE (within) regression with AR(1) disturbances  Number of obs     =        284
Group variable: gvkey                           Number of groups  =         30

R-sq:                                           Obs per group:
     within  = 0.0680                                         min =          1
     between = 0.1362                                         avg =        9.5
     overall = 0.0449                                         max =         20

                                                F(10,244)         =       1.78
corr(u_i, Xb)  = -0.8203                        Prob > F          =     0.0648

--------------------------------------------------------------------------------------------
               Adj_TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
               family_firm |   -1.45598   .4335852    -3.36   0.001    -2.310027   -.6019322
            Research_Sales |   .3188283   .2208669     1.44   0.150    -.1162208    .7538774
                 LTDebt_TA |   .7797611   .7449268     1.05   0.296    -.6875466    2.247069
                  CAPX_PPE |  -.2780882   1.480276    -0.19   0.851    -3.193839    2.637663
               Debt_MktVal |  -.2039972   .1849174    -1.10   0.271    -.5682353    .1602408
             lntotalassets |  -.1558742   .3098104    -0.50   0.615    -.7661183    .4543699
                  Firm_Age |   1.017311   .6477304     1.57   0.118    -.2585458    2.293168
morethan5ownership_founder |   .2875581   .4884075     0.59   0.557    -.6744748    1.249591
 morethan5ownership_family |   1.097396   .4879125     2.25   0.025     .1363382    2.058454
            CEO_Equity_Pay |  -1.215814   1.871552    -0.65   0.517    -4.902273    2.470645
                     _cons |  -2.184534   .7270062    -3.00   0.003    -3.616543   -.7525251
---------------------------+----------------------------------------------------------------
                    rho_ar |  .66687635
                   sigma_u |  1.6961688
                   sigma_e |  .80046898
                   rho_fov |   .8178516   (fraction of variance because of u_i)
--------------------------------------------------------------------------------------------
F test that all u_i=0: F(29,244) = 2.61                      Prob > F = 0.0000

. 
. 
. . xtreg Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay, fe
>  vce(cluster gvkey)

Fixed-effects (within) regression               Number of obs     =        316
Group variable: gvkey                           Number of groups  =         32

R-sq:                                           Obs per group:
     within  = 0.1276                                         min =          1
     between = 0.0076                                         avg =        9.9
     overall = 0.0290                                         max =         21

                                                F(10,31)          =      97.83
corr(u_i, Xb)  = -0.9304                        Prob > F          =     0.0000

                                               (Std. Err. adjusted for 32 clusters in gvkey)
--------------------------------------------------------------------------------------------
                           |               Robust
               Adj_TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
               family_firm |    .125311    .768062     0.16   0.871    -1.441162    1.691784
            Research_Sales |  -.2878188    .087537    -3.29   0.003    -.4663517   -.1092859
                 LTDebt_TA |  -1.373468    .820206    -1.67   0.104    -3.046289    .2993528
                  CAPX_PPE |   1.581573   2.807291     0.56   0.577    -4.143935    7.307082
               Debt_MktVal |  -.1908685   .2496387    -0.76   0.450    -.7000099     .318273
             lntotalassets |   .0278072    .437456     0.06   0.950    -.8643901    .9200045
                  Firm_Age |   2.331083   1.130333     2.06   0.048     .0257541    4.636412
morethan5ownership_founder |   1.372737   .8121856     1.69   0.101    -.2837265      3.0292
 morethan5ownership_family |   .2977501   .9755678     0.31   0.762    -1.691934    2.287434
            CEO_Equity_Pay |  -3.013912   1.969073    -1.53   0.136    -7.029863     1.00204
                     _cons |  -8.681163   5.939512    -1.46   0.154    -20.79488    3.432552
---------------------------+----------------------------------------------------------------
                   sigma_u |  2.8398231
                   sigma_e |   1.045575
                       rho |  .88062369   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------------

I tried to use -xtpcse- but ran into some problems again, so I also tried -xtcd- , which still didn't work.

Code:

. xtpcse Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay

Number of gaps in sample:  1
no time periods are common to all panels, cannot estimate disturbance
covariance matrix using casewise inclusion
r(459);

. xtcd  Adj_TobinsQ family_firm Research_Sales LTDebt_TA CAPX_PPE Debt_MktVal lntotalassets Firm_Age morethan5ownership_founder morethan5ownership_family CEO_Equity_Pay
too many variables specified
r(103);

. . xtcd Adj_TobinsQ family_firm
Error: The panel is highly unbalanced.
Not enough common observations across panel to perform Pesaran's test.
insufficient observations
r(2001);

I'm at a loss as to what else I can do concerning correlation across panels. Is there another way? And in any case, if a test did indicate cross-panel correlation, would the remedy still involve some sort of trade-off against correcting for serial correlation?

Evialina Yakimovich

Join Date: Nov 2019

Posts: 5
#6

05 Nov 2019, 13:00

Sorry, as a follow up: am I running into these issues also in part because I have a smaller T relative to N? Am I less likely to run into cross-panel issues with a smaller T and larger N?

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#7

06 Nov 2019, 00:41

Evialina:
as you can see, -xtreg- and -xtregar- use a different number of observations (due to the AR calculation in the latter).
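In the output posted above, -xtreg- uses 316 observations on 32 firms and -xtregar- uses 284 observations on 30 firms. The difference, 316 - 284 = 32, equals the number of firms in the -xtreg- sample, which is consistent with -xtregar, fe- losing one observation per firm to the AR(1) transformation (and with two single-observation firms dropping out altogether). The sketch below shows how one might compare the two estimation samples; it assumes the Grunfeld example panel (fetched with -webuse-) as stand-in data, so the variable names are not those of this thread.

Code:

* Stand-in data (an assumption): balanced panel of 10 companies over 20 years
webuse grunfeld, clear
xtset company year

* Ordinary fixed effects: uses every available observation
xtreg invest mvalue kstock, fe
display "xtreg:   N = " e(N) "   groups = " e(N_g)

* Fixed effects with AR(1) disturbances: the transformation costs observations
xtregar invest mvalue kstock, fe
display "xtregar: N = " e(N) "   groups = " e(N_g)

Because the two commands are fitted on different samples, part of the difference in coefficients may reflect the sample rather than the estimator.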
Your panel is probably too unbalanced to check for correlation across panels (and the reasons for your problems are not those suggested in #6).
However, since you're focusing on the -fe- estimator, the goal of your research is investigating variation within the same panel as time goes by: hence, across panel correlation may be a minor issue.
As an aside, if you're supervised by a professor/teacher/mentor discuss all these issues with her/him, in order to avoid problems as your research progresses.

Kind regards,
Carlo
(Stata 19.0)

Evialina Yakimovich

Join Date: Nov 2019

Posts: 5
#8

06 Nov 2019, 00:55

Hello Carlo, thank you for your help. Your points clarified a lot for me, and I’ll take up some of these concerns with my advisor. I appreciate your time and your contributions to the forum!

Best wishes and have a wonderful day,

Evi

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#9

06 Nov 2019, 01:11

Thanks, you too.

Kind regards,
Carlo
(Stata 19.0)