Panel data: testing for serial correlation and heteroskedasticity

Niels Meijer started a topic Panel data: testing for serial correlation and heteroskedasticity

16 Sep 2017, 05:55

Hello,

I have a panel data set of 200 banks covering 2002-2016, with varying degrees of data availability; on average, about 8.5 years of data are available per bank. The dependent variable is default risk, the explanatory variables are board characteristics of each bank, and there are various control variables.

I was wondering how I should go about testing for serial correlation and heteroskedasticity. I have already read https://www.stata.com/support/faqs/s...tocorrelation/ but that is not entirely what I am looking for, as it is unclear to me what exactly is happening there. I would like to run the tests for serial correlation and heteroskedasticity manually.

I have done a Breusch-Godfrey test for serial correlation before, but only on time series, not on a panel data set. Is this test also suitable for panel data? And how would I perform it on panel data?

Similarly, I have done a Breusch-Pagan test for heteroskedasticity before, but never on panel data. Is this test suitable for panel data?

Some help would be greatly appreciated, as I am new to panel data analysis.

Kind regards,

Niels
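
For readers landing on this question, here is a minimal sketch of how tests of this kind are often run in Stata on panel data. It assumes that the community-contributed commands -xtserial- (Wooldridge test for first-order serial correlation in panels) and -xttest3- (modified Wald test for groupwise heteroskedasticity after -xtreg, fe-) are installed. All variable names (default_risk, board_size, board_indep, size, bankid, year) are hypothetical placeholders and are not taken from the poster's data.

```stata
* Hypothetical variable names; replace them with your own.
* -xtserial- and -xttest3- are community-contributed:
* locate and install them with -search xtserial- and -search xttest3-.
xtset bankid year

* Wooldridge test: H0 = no first-order serial correlation
* in the idiosyncratic errors
xtserial default_risk board_size board_indep size

* Modified Wald test: H0 = equal error variance across panels
* (must be run directly after -xtreg, fe-)
xtreg default_risk board_size board_indep size, fe
xttest3
```

A rejection in either test is usually handled by switching to cluster-robust standard errors rather than by changing the estimator.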
Martin Knipp replied

23 Feb 2021, 06:42
For those looking for a citable reference for the claim that serial correlation is not a big issue in short panels, see p. 332 of:
Akel, V., & Torun, T. (2017). Stock market development and economic growth: the case of MSCI emerging market index countries. In Global Financial Crisis and Its Ramifications on Capital Markets (pp. 323-336). Springer, Cham.
Martin Knipp replied

21 Feb 2021, 11:33
Originally posted by Carlo Lazzaro

Martin:
in the post you quoted, clustering is recommended because what matters is the number of panels (i.e., clusters), regardless of the T dimension (which, along with other conditions, can play a role in strengthening the within-cluster correlation of the systematic error). You may want to take a look at:
https://www.stata.com/meeting/wcsug07/cameronwcsug.pdf and related references (actually, I'm not aware of an article that focuses specifically on this particular topic).

Thank you, Carlo.
The same point is made on slide 36 of http://www.princeton.edu/~otorres/Panel101.pdf.
Unfortunately, no references are given there.

Cheers
Martin
Carlo Lazzaro replied

21 Feb 2021, 07:41
Martin:
in the post you quoted, clustering is recommended because what matters is the number of panels (i.e., clusters), regardless of the T dimension (which, along with other conditions, can play a role in strengthening the within-cluster correlation of the systematic error). You may want to take a look at:
https://www.stata.com/meeting/wcsug07/cameronwcsug.pdf and related references (actually, I'm not aware of an article that focuses specifically on this particular topic).

Martin Knipp replied

21 Feb 2021, 06:32

Originally posted by Carlo Lazzaro

Niels:
before switching to -xtgls- despite having a large N, small T dataset, please note the dramatically different times (in seconds) taken by -xtreg- and -xtgls- to perform the same simple panel data regression:

Code:

. set rmsg on
r; t=0.00 15:48:21

. xtreg ln_wage i.race, re

Random-effects GLS regression                   Number of obs     =     28,534
Group variable: idcode                          Number of groups  =      4,711

R-sq:                                           Obs per group:
     within  = 0.0000                                         min =          1
     between = 0.0198                                         avg =        6.1
     overall = 0.0186                                         max =         15

                                                Wald chi2(2)      =      99.02
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      black  |  -.1300382    .013486    -9.64   0.000    -.1564702   -.1036062
      other  |   .1011474   .0562889     1.80   0.072    -.0091768    .2114716
             |
       _cons |   1.691756   .0071865   235.41   0.000     1.677671    1.705841
-------------+----------------------------------------------------------------
     sigma_u |  .38195681
     sigma_e |  .32028665
         rho |  .58714668   (fraction of variance due to u_i)
------------------------------------------------------------------------------
r; t=0.61 15:48:28

. xtgls ln_wage i.race

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        homoskedastic
Correlation:   no autocorrelation

Estimated covariances      =         1          Number of obs     =     28,534
Estimated autocorrelations =         0          Number of groups  =      4,711
Estimated coefficients     =         3          Obs per group:
                                                              min =          1
                                                              avg =   6.056888
                                                              max =         15
                                                Wald chi2(2)      =     542.80
Log likelihood             =    -19162          Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      black  |  -.1427862    .006243   -22.87   0.000    -.1550222   -.1305502
      other  |    .080671   .0274112     2.94   0.003      .026946     .134396
             |
       _cons |   1.714338   .0033339   514.21   0.000     1.707804    1.720873
------------------------------------------------------------------------------
r; t=692.49 16:00:07
.

A possible work-around could be:
- skip -xttest2- and -xttest3-;
- graphically inspect your residual distribution;
- robustify/cluster your standard errors if you suspect that heteroskedasticity (especially) can bite your results (as said, serial correlation is expected to be a minor nuisance with a short T dimension).

Otherwise, as many econometricians usually do, go -cluster-/-robust- from scratch; with 200 -panelid- you have enough clusters to survive.
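
The work-around above can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the output above comes from the nlswork example data set shipped with Stata, and the regressors age and tenure are chosen here purely for illustration (race is time-invariant and would be dropped by the fixed-effects estimator). The names ehat and xbhat are made up for this sketch.

```stata
* Assumption: the example data are the nlswork data set shipped with Stata
webuse nlswork, clear
xtset idcode year

* Standard errors clustered on the panel identifier; these are robust to
* heteroskedasticity and to serial correlation within each panel
xtreg ln_wage age tenure, fe vce(cluster idcode)

* Idiosyncratic residuals and linear prediction for a visual check
predict double ehat, e
predict double xbhat, xb

* Residual distribution against a normal density
histogram ehat, normal

* Residuals against fitted values; a fan shape hints at heteroskedasticity
scatter ehat xbhat, msize(tiny) yline(0)
```

With clustered standard errors the point estimates are unchanged relative to plain -xtreg, fe-; only the standard errors and the resulting test statistics differ.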

I'd like to pick up this thread, Carlo, since you refer to it here.
I have read a lot recently that serial correlation (autocorrelation) is not a big problem in short panels. Unfortunately, I have not been able to find a reliable (highly ranked) paper that supports this claim.
Carlo, would you be so kind as to cite a study if you know of one?

Thanks a lot!
Martin
