  • #16
Temirlan:
let's stick with your original post (on which Fei and George already commented positively):
1) normality test: normality is a (weak) requirement for the residual distribution only;
2) multicollinearity test: if perfect, Stata fixes it by default; if quasi-extreme (and you are confident that the data generating process is well represented in terms of predictors and/or interactions), see the humorous Chapter 23 in https://www.hup.harvard.edu/catalog....=9780674175440. More seriously, quasi-extreme multicollinearity (which is often the stalking horse under which misspecification hides) may be an issue when you detect "weird" standard errors;
3) heteroskedasticity test: there's no gain in running this test when you have invoked non-default standard errors (and the same holds when you detect autocorrelation). In addition, I would not sponsor running -hausman- with default standard errors (while being aware of heteroskedasticity and/or autocorrelation) and then replacing them with their non-default counterparts.
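As a concrete example for point 3, the non-default standard errors in question would be cluster-robust ones (a minimal sketch, assuming an -xtset- panel with identifier -id- and the variables discussed in this thread):

Code:
* With cluster-robust standard errors there is no gain in a
* separate heteroskedasticity (or autocorrelation) test:
xtreg ROE ESG, fe vce(cluster id)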
    Kind regards,
    Carlo
    (Stata 19.0)



    • #17
I see. If ESG is a linear combination of the other three, then you can't run them all together (Stata will drop something because of the collinearity). The difference between the two models might be interesting, since ESG may be insignificant while one or more of the three may not be, or all the variables may be significant but with different impacts. I'd check the scale of the individual components, as they may be inconsistent, leading to differently sized coefficients. You might standardize them first: egen GOVs = std(GOV). Or use margins to test for a 1 sd change and compute the % change; see the sketch below.
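A minimal sketch of that standardize-first idea (assuming an -xtset- panel with identifier -id-; here the percent change is computed from the stored coefficient via -display- rather than -margins-):

Code:
* Standardize each component so each coefficient measures
* the effect of a one-sd change in that predictor
egen ENVs = std(ENV)
egen SOCs = std(SOC)
egen GOVs = std(GOV)
xtreg ROE ENVs SOCs GOVs, fe vce(cluster id)

* Express a one-sd change in GOV as a percentage of mean ROE
summarize ROE if e(sample), meanonly
display "1-sd change in GOV: " 100*_b[GOVs]/r(mean) " percent of mean ROE"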



      • #18
Temirlan:
George already warned you about going that way (and wisely so).
1) If your -ESG- is correlated with the E, S and G scores, you will probably face collinearity issues (since, when adjusted for each other, the predictors partially overlap).
Moreover, if you run three separate regressions, you're implicitly stating that variation in the regressand is explained, in turn, by three different predictors (and this approach does not give a true and fair view of the data generating process you're investigating).
Hence, I would include all three scores on the right-hand side of a single regression (and forget about ESG).
2) Issues about -hausman-: this test works well asymptotically and, as such, can sometimes be as fragile as crystal. Some researchers do not consider the -hausman- verdict as written in stone (I think this was Fei's take). In addition, the assumption of no correlation between the panel-wise effect (u) and the vector of regressors is often too demanding to hold under the -re- specification.
That said, your methodological sequence is correct (first -fe-, then -re-; finally, comparing the two specifications via -hausman-) as long as you use default standard errors.
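For instance, a minimal sketch of that default-standard-error sequence (assuming the data have already been -xtset- and using the three scores discussed above):

Code:
* Default standard errors throughout, as -hausman- requires
xtreg ROE ENV SOC GOV, fe
estimates store fe
xtreg ROE ENV SOC GOV, re
estimates store re
hausman fe re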
If this is not the case, you should test the -re- specification via the community-contributed module -xtoverid- (if the null is rejected, go -fe-).
        Kind regards,
        Carlo
        (Stata 19.0)



        • #19
          Carlo sums it up nicely, but I suspect your professor is demanding you run a bunch of tests even if they are unnecessary. That's fine. You'll learn something. Run them, report them.

          As for normality, it does not matter. There are many citations, but here are a couple I found:
          Sawilowsky SS, Blair RC. 1992. A more realistic look at the robustness and type II error properties of the t test to departures from population normality. Psychol. Bull. 111:352–60.
Sawilowsky SS, Hillman SB. 1993. Power of the independent samples t test under a prevalent psychometric measure distribution. J. Consult. Clin. Psychol. 60:240–43.

          Stata has nice normality plots (https://campusguides.lib.utah.edu/c....0853&p=1054157). Normality is almost always rejected in larger samples, so the plots are an alternative.

          Cluster the standard errors on the cross section--that deals with autocorr and hetero. Done!

          So your model is:
          Code:
* I label the cross-section identifier id. Just substitute.
* I label the year variable year. Just substitute.
* Fixed vs random effects (Hausman-type test via -xtoverid-)
xtreg ROE ESG, re cluster(id)
xtoverid

* Model 1 (-resid- is needed so residuals can be predicted)
reghdfe ROE ESG, absorb(year id) cluster(id) resid
predict r, resid
swilk r
pnorm r
qnorm r

* Model 2 (use a fresh residual variable, not Model 1's r)
reghdfe ROE ENV SOC GOV, absorb(year id) cluster(id) resid
predict r2, resid
swilk r2
pnorm r2
qnorm r2

* And if the scaling of the components is inconsistent, then
egen ENVs = std(ENV)
egen SOCs = std(SOC)
egen GOVs = std(GOV)
reghdfe ROE ENVs SOCs GOVs, absorb(year id) cluster(id)



          • #20
Thank you for your kind discussion, George and Carlo. I really appreciate your help.

One last question: if the Hausman test indicates using the random-effects model, what will be the command to control for autocorrelation and heteroskedasticity?



            • #21
Cluster handled the autocorr and hetero issues in the regression line.
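In other words, if the tests had favored -re-, the clustered random-effects command would be the same line used above for the -xtoverid- step:

Code:
xtreg ROE ESG, re cluster(id)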



              • #22
Temirlan:
as already warned about, you should impose the -vce(cluster panelid)- option before testing the -fe- vs -re- specification. Then you should test the null that -re- is the way to go via the community-contributed module -xtoverid- (because -hausman- does not support non-default standard errors).
As an aside, please note that under -xtreg- (unlike -regress-) the -vce(cluster panelid)- and -robust- options do the very same job (i.e., taking heteroskedasticity and/or autocorrelation into account).
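For example (a minimal sketch; -panelid- stands for your panel identifier), these two calls return identical standard errors under -xtreg-:

Code:
xtreg ROE ESG, re vce(cluster panelid)
xtreg ROE ESG, re robust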
                Kind regards,
                Carlo
                (Stata 19.0)



                • #23
                  Dear George and Carlo, thank you for your help.

I got some interesting results when running the -fe- and -re- regressions with clustered standard errors and using -xtoverid- to choose between the models.

When I ran the -xtreg, fe- regression, Prob > F was 0.5896, which is not significant, so the model may be wrong. The -xtreg, re- regression showed Prob > chi2 = 0.0282, which is significant at the 5 percent level. But when I run -xtoverid- to decide which model to use, the p-value is 0.0002, which indicates the fixed-effects model.

Is it correct to use the fixed-effects model in such a case? What is the right step in this scenario?

                  Kind regards,
                  Temirlan.



                  • #24
Temirlan:
                    could you please share via CODE delimiters what you typed and what Stata gave you back (as per FAQ)? Thanks.
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #25
Dear Carlo, please see the following code and output:

Code:
. xtreg ROE ESG_total Leverage NL GDP, fe cluster(ID)

Fixed-effects (within) regression               Number of obs     =  459
Group variable: ID                              Number of groups  =   51

R-squared:                                      Obs per group:
     Within  = 0.0292                                        min =    9
     Between = 0.2481                                        avg =  9.0
     Overall = 0.1949                                        max =    9

                                                F(4,50)           = 2.53
corr(u_i, Xb) = 0.1873                          Prob > F          = 0.0519

. xtreg ROE ESG_total Leverage NL GDP, re cluster(ID)

Random-effects GLS regression                   Number of obs     =  459
Group variable: ID                              Number of groups  =   51

R-squared:                                      Obs per group:
     Within  = 0.0252                                        min =    9
     Between = 0.3172                                        avg =  9.0
     Overall = 0.2460                                        max =    9

                                                Wald chi2(4)      = 24.77
corr(u_i, X) = 0 (assumed)                      Prob > chi2       = 0.0001

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(ID)
Sargan-Hansen statistic 19.309  Chi-sq(4)  P-value = 0.0007



                      • #26
Temirlan:
to increase your chances of getting (more) helpful replies, in the future please post all the relevant pieces of your Stata session (the code you typed + the output tables). Thanks.
That said:
1) -fe- specification: it may well be that you have many time-invariant predictors that the -fe- estimator wipes out. That may be the reason why you end up with a barely non-significant P-value for the F-test. The F-test is not concerning in itself, but it may warn you about the absence of evidence of a panel-wise effect. By the way, if you go with default standard errors, what does the F-test appearing as a footnote under the -xtreg, fe- outcome table tell you?
Otherwise, it may be that you have a misspecified model;
2) -re- specification: what does -xttest0- after -xtreg, re- give you back?
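That is (a minimal sketch with your variable names):

Code:
* Default standard errors, to see the footnote F-test that all u_i = 0
xtreg ROE ESG_total Leverage NL GDP, fe
* Breusch and Pagan LM test for random effects after -re-
xtreg ROE ESG_total Leverage NL GDP, re
xttest0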
                        Kind regards,
                        Carlo
                        (Stata 19.0)

