  • Panel data: testing for serial correlation and heteroskedasticity

    Hello,

    I've got a panel data set of 200 banks covering 2002-2016, with varying degrees of data availability; on average there are about 8.5 years of data per bank. The dependent variable is default risk, and the explanatory variables are board characteristics for each bank plus various control variables.

    Now I was wondering how I should go about testing for serial correlation and heteroskedasticity. I've already read this https://www.stata.com/support/faqs/s...tocorrelation/ but that is not entirely what I am looking for, as it is unclear to me what exactly is happening. I would like to run tests for serial correlation and heteroskedasticity manually.

    I've done a Breusch-Godfrey test for serial correlation before, but only on time series, not on a panel dataset. Is this also suitable for panel data? And how would I perform this test for panel data?

    Similarly, I've done a Breusch-Pagan test for heteroskedasticity before, but never on panel data. Is it suitable for panel data?

    Some help would be greatly appreciated, as I am new to panel data analysis.

    Kind regards,

    Niels

  • Martin Knipp
    replied
    For those interested in a citable reference for the claim that serial correlation is not a big issue in short panels (p. 332):
    Akel, V., & Torun, T. (2017). Stock market development and economic growth: the case of MSCI emerging market index countries. In Global Financial Crisis and Its Ramifications on Capital Markets (pp. 323-336). Springer, Cham.

  • Martin Knipp
    replied
    Originally posted by Carlo Lazzaro
    Martin:
    in the post you quoted, clustering is recommended because the number of panels (i.e., clusters) is what actually matters, regardless of the T dimension (which, along with other conditions, can play a role in increasing the within-cluster correlation of the systematic error). You may want to take a look at:
    https://www.stata.com/meeting/wcsug07/cameronwcsug.pdf and related references (actually, I'm not aware of an article that focuses specifically on this particular topic).
    Thank you, Carlo.
    The same point is also made here: http://www.princeton.edu/~otorres/Panel101.pdf (slide 36).
    Unfortunately, no references are given there.

    Cheers
    Martin

  • Carlo Lazzaro
    replied
    Martin:
    in the post you quoted, clustering is recommended because the number of panels (i.e., clusters) is what actually matters, regardless of the T dimension (which, along with other conditions, can play a role in increasing the within-cluster correlation of the systematic error). You may want to take a look at:
    https://www.stata.com/meeting/wcsug07/cameronwcsug.pdf and related references (actually, I'm not aware of an article that focuses specifically on this particular topic).

  • Martin Knipp
    replied
    Originally posted by Carlo Lazzaro
    Niels:
    before switching to -xtgls- despite having a large N, small T dataset, please note the dramatically different times (in seconds) taken by -xtreg- and -xtgls- to perform the same simple panel data regression:
    Code:
    . set rmsg on
    r; t=0.00 15:48:21

    . xtreg ln_wage i.race, re

    Random-effects GLS regression                   Number of obs     =     28,534
    Group variable: idcode                          Number of groups  =      4,711

    R-sq:                                           Obs per group:
         within  = 0.0000                                         min =          1
         between = 0.0198                                         avg =        6.1
         overall = 0.0186                                         max =         15

                                                    Wald chi2(2)      =      99.02
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            race |
          black  |  -.1300382    .013486    -9.64   0.000    -.1564702   -.1036062
          other  |   .1011474   .0562889     1.80   0.072    -.0091768    .2114716
                 |
           _cons |   1.691756   .0071865   235.41   0.000     1.677671    1.705841
    -------------+----------------------------------------------------------------
         sigma_u |  .38195681
         sigma_e |  .32028665
             rho |  .58714668   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    r; t=0.61 15:48:28

    . xtgls ln_wage i.race

    Cross-sectional time-series FGLS regression

    Coefficients:  generalized least squares
    Panels:        homoskedastic
    Correlation:   no autocorrelation

    Estimated covariances      =         1          Number of obs     =     28,534
    Estimated autocorrelations =         0          Number of groups  =      4,711
    Estimated coefficients     =         3          Obs per group:
                                                                  min =          1
                                                                  avg =   6.056888
                                                                  max =         15
                                                    Wald chi2(2)      =     542.80
    Log likelihood             =    -19162          Prob > chi2       =     0.0000

    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            race |
          black  |  -.1427862    .006243   -22.87   0.000    -.1550222   -.1305502
          other  |    .080671   .0274112     2.94   0.003      .026946     .134396
                 |
           _cons |   1.714338   .0033339   514.21   0.000     1.707804    1.720873
    ------------------------------------------------------------------------------
    r; t=692.49 16:00:07
    .
    A possible work-around could be:
    - skip -xttest2- and -xttest3-;
    - graphically inspect your residual distribution;
    - robustify/cluster your standard errors if you suspect that (especially) heteroskedasticity can bite your results (as said, serial correlation is expected to be a minor nuisance with a short T dimension).

    Otherwise, as many econometricians usually do, go -cluster-/-robust- from scratch; with 200 -panelid- you have enough clusters to survive.
    I'd like to use this thread, because you are referring to it here, Carlo.
    I have read a lot recently that serial correlation/autocorrelation is not a big problem in short panel data. Unfortunately, I was not able to find a reliable (highly ranked) paper supporting that claim.
    Carlo, would you be so kind as to cite a study if you know of one?

    Thanks a lot!
    Martin
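
    A minimal sketch of the work-around quoted above, assuming the quoted output was produced with Stata's nlswork example data (the variable names and sample size match); the regressors age and tenure are chosen only for illustration:
    Code:
    webuse nlswork, clear
    xtset idcode year
    * fixed-effects fit with standard errors clustered on the panel identifier
    xtreg ln_wage age tenure, fe vce(cluster idcode)
    * idiosyncratic residuals and linear prediction from the fitted model
    predict double e_it if e(sample), e
    predict double xb_hat if e(sample), xb
    * visual checks: residual spread against fitted values, and residual distribution
    scatter e_it xb_hat, yline(0)
    histogram e_it, normal
    A fan-shaped scatter would hint at heteroskedasticity; the plots are a diagnostic aid, not a formal test.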

  • Niels Meijer
    replied
    Thank you very much Carlo

  • Carlo Lazzaro
    replied
    Niels:
    cross-sectional interdependence can be accommodated via (say) the user-written programme -xtscc- (type -search xtscc- from within Stata to install it).
    The following should help: http://www.stata-journal.com/sjpdf.h...iclenum=st0113
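
    A rough sketch of how -xtscc- might be called, using Stata's balanced grunfeld example data purely as a stand-in (the dataset and regressors are assumptions for illustration, not the bank panel discussed here):
    Code:
    * install once; -search xtscc- as suggested above also works
    ssc install xtscc
    webuse grunfeld, clear
    xtset company year
    * fixed-effects regression with Driscoll-Kraay standard errors
    xtscc invest mvalue kstock, fe
    Driscoll-Kraay standard errors rely on large-T asymptotics, so with roughly 8.5 years per bank the results are best read with caution.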

  • Niels Meijer
    replied
    Thank you once again for your responses. I ended up using xtserial (Wooldridge test) and the Breusch-Pagan test, which indicate the presence of both autocorrelation and heteroskedasticity. I think I'll run the test for cross-sectional dependence too. Are robust/clustered errors also able to overcome cross-sectional interdependence?

  • Carlo Lazzaro
    replied
    Niels:
    while the user-written programme -xtserial- is OK for testing serial correlation, the BP test that Stata offers for panel data (-xttest0-) tests the random-effects specification, not heteroskedasticity (however, it's true that a BP test for heteroskedasticity is available in Stata as a -regress postestimation- command).
    Again, you should consider -xttest3- for heteroskedasticity checking (-xttest2- tests cross-sectional independence instead), keeping in mind that it works after -xtreg, fe- (or -xtgls-) only.
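
    To make the distinction between these commands concrete, here is a hedged sketch on Stata's grunfeld example data (dataset and regressors are stand-ins chosen for illustration; -xtserial- and -xttest3- are user-written and must be installed first):
    Code:
    webuse grunfeld, clear
    xtset company year
    * Wooldridge test for first-order serial correlation (install via -search xtserial-)
    xtserial invest mvalue kstock
    * Breusch-Pagan LM test for random effects (H0: Var(u_i) = 0), NOT heteroskedasticity
    xtreg invest mvalue kstock, re
    xttest0
    * modified Wald test for groupwise heteroskedasticity after -xtreg, fe-
    ssc install xttest3
    xtreg invest mvalue kstock, fe
    xttest3
    * Breusch-Pagan/Cook-Weisberg heteroskedasticity test after pooled OLS
    regress invest mvalue kstock
    estat hettest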

    PS: Crossed in cyberspace with Jesse's helpful reply, that, interestingly, shows different takes.

  • Jesse Wursten
    replied
    I think White's is most commonly used for time series rather than panels. You could use the xtqptest command from SSC, which is a bit more flexible and powerful than xtserial. If you like testing, you can also run xtcdf to check for cross-sectional dependence.

  • Niels Meijer
    replied
    Carlo and River, thank you very much for your replies. They are very helpful.

    As this is for my thesis, the point is also to show that I've run the tests and show the results, not just for deciding whether I should use robust options. So how do I run the test after running the xtgls regression? It is unclear to me what's happening exactly with the xtgls command.

    I'd like to run a Breusch-Pagan test for heteroskedasticity as I said, as my econometrics instructor told me I can use this for panel data as well. Is just using the -regress- and then -hettest- commands okay?

    Also, for serial correlation (I know you said this is probably not a problem, but I'd like to include it in my thesis anyway) I've run the -xtserial- command which runs the Wooldridge (2002) test for serial correlation, is this ok? Also, I'd like to know if it is possible to use the White test, as a friend recommended it to me.

    The aforementioned tests indicate that there is both serial correlation and heteroskedasticity in my data, thus this would lead me to use the robust option.
    Last edited by Niels Meijer; 18 Sep 2017, 04:04.
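
    For readers who want to see what the -regress- plus -hettest- route computes, the N*R-squared (Koenker) form of the Breusch-Pagan statistic can be reproduced by hand. This is a sketch on Stata's grunfeld example data, with regressors chosen only for illustration; it ignores the panel structure, as pooled OLS does:
    Code:
    webuse grunfeld, clear
    regress invest mvalue kstock
    * built-in versions: BP on the right-hand-side variables (N*R2 form), and White's test
    estat hettest, rhs iid
    estat imtest, white
    * manual version: regress squared residuals on the same regressors
    predict double ehat if e(sample), residuals
    generate double ehat2 = ehat^2
    regress ehat2 mvalue kstock
    * LM = N * R2 of the auxiliary regression, chi-squared with df = number of regressors
    scalar lm = e(N)*e(r2)
    display "LM = " scalar(lm) ", df = " e(df_m) ", p = " chi2tail(e(df_m), scalar(lm))
    The manual LM statistic should agree with the one reported by -estat hettest, rhs iid-.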

  • River Huang
    replied
    Carlo has the right answer.

  • Carlo Lazzaro
    replied
    Erol:
    please note that under -xtreg-, the -robust- and -cluster- options do the same job.
    That feature does not apply to -regress-, where the -robust- and -cluster- options are totally different beasts.
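
    A small sketch of this point, assuming a recent Stata version and the nlswork example data (regressors chosen only for illustration):
    Code:
    webuse nlswork, clear
    xtset idcode year
    * under -xtreg-, these two calls report the same standard errors,
    * because vce(robust) is implemented as clustering on the panel identifier
    xtreg ln_wage age tenure, fe vce(robust)
    xtreg ln_wage age tenure, fe vce(cluster idcode)
    * under -regress-, they differ: heteroskedasticity-robust only
    * versus robust to within-idcode correlation as well
    regress ln_wage age tenure, vce(robust)
    regress ln_wage age tenure, vce(cluster idcode)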

  • Erol Egemen
    replied
    Originally posted by River Huang
    In finance journals, you can find that the "common" way to deal with serial correlation and heteroskedasticity is to (directly) use "clustered standard errors".
    Do you mean using the "robust" option for xtreg?

  • River Huang
    replied
    In finance journals, you can find that the "common" way to deal with serial correlation and heteroskedasticity is to (directly) use "clustered standard errors".
