Dear all,
I have spent the last three weeks researching (here, on the web and in textbooks) a sound solution to the problem described below. As a result I am better informed, but also increasingly confused: there seem to be many different solutions (and perspectives on how to handle the problem), and any given solution seems to be specific to a particular situation or data set. I therefore hope you can help me get on the right path ;o)
I have a cross-sectional data set (n ~200) that I would like to analyse using the regress command. However, when I check the model assumptions, heteroskedasticity appears (as a consequence of differences between genders), cf. Stata paste-in I.
Thus, I need to account for the heteroskedasticity somehow.
My research tells me that several solutions are available:
- using a model less sensitive to / taking into account heteroskedasticity
- weighting of independent variables
- transformation
I would prefer the first option, as the latter two appear to alter the data themselves to a greater or lesser extent.
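To make the three options concrete, this is roughly how I understand they could look in Stata. It is only a sketch: vce(robust) is my own guess at what "less sensitive to heteroskedasticity" means in practice, the weighting step is a by-hand version using residual variances per gender, and the variable names ehat, s2 and lnMeasurement are made up for illustration.

Code:
```stata
* Option 1a (my assumption): keep the OLS coefficients, but use
* heteroskedasticity-robust standard errors
regress Measurement gender Team, vce(robust)

* Option 1b: model the variance explicitly, letting it differ by gender
hetregress Measurement gender Team, het(i.gender)

* Option 2: weight each observation by the inverse of its group's residual
* variance (ehat and s2 are hypothetical helper variables)
regress Measurement gender Team
predict double ehat, residuals
egen double s2 = mean(ehat^2), by(gender)
regress Measurement gender Team [aweight = 1/s2]

* Option 3: transform the outcome and re-check the assumption
* (only sensible if Measurement is strictly positive)
generate double lnMeasurement = ln(Measurement)
regress lnMeasurement gender Team
estat hettest
```

As far as I can tell, option 1a leaves the coefficients from regress unchanged and only alters the standard errors, whereas options 1b, 2 and 3 can change the coefficients as well.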
In relation to the first option, I have looked into the hetregress command, as described here:
https://www.stata.com/new-in-stata/h...ar-regression/
https://www.stata.com/manuals/rhetregress.pdf
Consequently, I have tried to run a hetregress model (cf. Stata paste-in II), but I am uncertain how to check whether using this model reduces or eliminates the effect of heteroskedasticity. The Stata manual refers to a Wald test for heteroskedasticity, but gives no guidance on its interpretation (my take is that heteroskedasticity is still present).
I would greatly appreciate it if someone could tell me how to interpret the Wald test and/or give me some hints on other (better) ways to handle this problem (heteroskedasticity).
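As a check on my own reading of paste-in II, the numbers in the lnsigma2 block can be worked through by hand. This is a sketch under my assumption that het(i.gender) models the log of the residual variance as a constant plus a shift for gender == 2; the values are copied from the output, so they are rounded.

Code:
```stata
* Squared z statistic of the 2.gender coefficient in the lnsigma2 equation;
* with a single variance term this should match the reported Wald chi2(1)
display (-.9054411/.2190943)^2

* Implied residual variance for the base gender category (assumption:
* exp() of the lnsigma2 constant) and for gender == 2
display exp(7.908204)
display exp(7.908204 - .9054411)

* Ratio of the two implied variances (gender == 2 relative to the base)
display exp(-.9054411)
```

The first line gives about 17.08, which matches the reported Wald statistic, and the last line gives roughly 0.40. If my assumption holds, the test would be asking whether the variance differs between the genders, rather than whether heteroskedasticity remains after fitting the model, but I would appreciate confirmation of that reading.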
Thanks in advance.
Best, Sarah
Code:
Stata paste-in I

. regress Measurement gender Team

      Source |       SS           df       MS      Number of obs   =       412
-------------+----------------------------------   F(2, 409)       =    189.32
       Model |  723251.953         2  361625.977   Prob > F        =    0.0000
    Residual |  781261.969       409  1910.17596   R-squared       =    0.4807
-------------+----------------------------------   Adj R-squared   =    0.4782
       Total |  1504513.92       411  3660.61782   Root MSE        =    43.706

------------------------------------------------------------------------------
 Measurement |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |  -83.87553   4.310553   -19.46   0.000    -92.34913   -75.40193
        Team |  -.4511106   4.306488    -0.10   0.917    -8.916722    8.014501
       _cons |   312.0256   9.319073    33.48   0.000     293.7063    330.3448
------------------------------------------------------------------------------

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of Measurement

         chi2(1)      =    39.43
         Prob > chi2  =   0.0000
Code:
Stata paste-in II

. hetregress Measurement gender Team, het(i.gender) twostep

Heteroskedastic linear regression               Number of obs     =        412
Two-step GLS estimation
                                                Wald chi2(2)      =     397.32
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
 Measurement |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Measurement  |
      gender |  -83.87537   4.207937   -19.93   0.000    -92.12277   -75.62797
        Team |   .3000442   3.878994     0.08   0.938    -7.302643    7.902732
       _cons |   310.9004   9.397964    33.08   0.000     292.4807    329.3201
-------------+----------------------------------------------------------------
lnsigma2     |
    2.gender |  -.9054411   .2190943    -4.13   0.000    -1.334858   -.4760242
       _cons |   7.908204    .151501    52.20   0.000     7.611267     8.20514
------------------------------------------------------------------------------
Wald test of lnsigma2=0: chi2(1) = 17.08                  Prob > chi2 = 0.0000