Originally posted by Carlo Lazzaro
Deborah:
you should get yourself more familiar with the Statalist rules.
That said, working from your .dta file (please learn how to share a data example/excerpt via -dataex-, so that repliers run no risk of downloading active content), you can proceed as shown in the code block further down.
Once the outcome is ln-transformed, the number of your observations drops (zero or negative values cannot be logged), but your regression looks technically fine (heteroskedasticity was accounted for via -robust- standard errors).
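As a minimal sketch of the two preliminaries just mentioned (sharing an excerpt and checking what the log transformation drops), something along these lines should work. The variable names are taken from the regression in this thread; the 20-observation cut-off is only my assumption about what makes a reasonable excerpt:

```stata
* share a small, safe excerpt of the relevant variables (20 rows is an arbitrary choice)
dataex DiffMeanHourlyPercent RegionCode IndustrySectorCode EmployerSizecode in 1/20

* how many observations have zero or negative values, which cannot be logged?
count if DiffMeanHourlyPercent <= 0

* ln() returns missing for those values, so they drop out of -regress-
count if missing(ln(DiffMeanHourlyPercent))
```

The second count also includes observations where the original variable was already missing, so it can be larger than the first.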
Please note that:
- you now have a log-linear regression (see how to interpret its coefficients in any decent econometrics textbook);
- your R-sq is not that high. This might be due to the lack of non-categorical predictors on the right-hand side of your regression equation;
- get familiar with the -estat hettest-, -estat ovtest- and -linktest- postestimation commands by reading the related entries in the Stata .pdf manuals.
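To make these notes a bit more concrete, here is a hedged sketch (certainly not the only way to go). In a log-linear model, the coefficient b on an indicator translates into a 100*(exp(b)-1) percent difference in the outcome relative to the base category; with the coefficient .6965266 reported for IndustrySectorCode 11 in the regression output, that is roughly 100.7%. The base levels are whatever Stata picked by default, and the choice of EmployerSizecode for the joint test is only an illustration:

```stata
* refit the model (vce(robust) is the current spelling of the -robust- option)
regress ln_DiffMeanHourlyPercent i.RegionCode i.IndustrySectorCode i.EmployerSizecode, vce(robust)

* percent difference vs. the base sector implied by a dummy coefficient
display 100*(exp(_b[11.IndustrySectorCode]) - 1)

* joint test of all the employer-size indicators
testparm i.EmployerSizecode

* specification checks
estat ovtest
linktest

* -estat hettest- is meant for default (non-robust) standard errors,
* so refit without vce(robust) before calling it
regress ln_DiffMeanHourlyPercent i.RegionCode i.IndustrySectorCode i.EmployerSizecode
estat hettest
```

Note that -linktest- fits its own auxiliary regression, which is why the model is refitted before -estat hettest-.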
Code:
. gen ln_DiffMeanHourlyPercent=ln( DiffMeanHourlyPercent)

. regress ln_DiffMeanHourlyPercent i.RegionCode i.IndustrySectorCode i.EmployerSizecode, robust

Linear regression                               Number of obs     =      6,256
                                                F(34, 6221)       =      26.05
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1204
                                                Root MSE          =     .92171

------------------------------------------------------------------------------------
                   |               Robust
ln_DiffMeanHourl~t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
        RegionCode |
                 2 |   .1719134   .0718272     2.39   0.017     .0311073    .3127196
                 3 |   .2133954   .0610139     3.50   0.000      .093787    .3330037
                 4 |   .0436778   .0749884     0.58   0.560    -.1033254    .1906811
                 5 |   .1382207   .0667634     2.07   0.038     .0073414    .2690999
                 6 |  -.0827402   .1919834    -0.43   0.667    -.4590941    .2936136
                 7 |   .0892374    .078761     1.13   0.257    -.0651614    .2436362
                 8 |   .2251707   .0653417     3.45   0.001     .0970784     .353263
                 9 |   .1172461   .0738838     1.59   0.113    -.0275916    .2620838
                10 |   .0058935   .0905899     0.07   0.948     -.171694     .183481
                11 |   .0947933   .0704681     1.35   0.179    -.0433486    .2329352
                   |
IndustrySectorCode |
                 2 |   .3597039   .2165566     1.66   0.097    -.0648219    .7842297
                 3 |  -.0439148   .1833722    -0.24   0.811    -.4033876     .315558
                 4 |    .298753   .2023624     1.48   0.140    -.0979472    .6954532
                 5 |  -.2177145    .235142    -0.93   0.355     -.678674     .243245
                 6 |   .5294608   .1857151     2.85   0.004      .165395    .8935266
                 7 |   .1806306   .1840547     0.98   0.326    -.1801801    .5414413
                 8 |  -.1691013   .1900056    -0.89   0.374    -.5415779    .2033753
                 9 |  -.6617052   .1918709    -3.45   0.001    -1.037838    -.285572
                10 |   .2723101   .1849734     1.47   0.141    -.0903016    .6349219
                11 |   .6965266   .1845571     3.77   0.000     .3347309    1.058322
                12 |   .3836583   .2072682     1.85   0.064    -.0226589    .7899755
                13 |   .3344334   .1844582     1.81   0.070    -.0271683    .6960351
                14 |   .0391356   .1850471     0.21   0.833    -.3236205    .4018917
                15 |  -.3197152   .2994744    -1.07   0.286    -.9067885     .267358
                16 |   .0697266   .1901965     0.37   0.714    -.3031243    .4425775
                17 |  -.3545895   .1944688    -1.82   0.068    -.7358154    .0266364
                18 |   .6443213   .2082458     3.09   0.002     .2360877    1.052555
                19 |   .0026552   .1959132     0.01   0.989    -.3814023    .3867127
                20 |   -.055708   .2865058    -0.19   0.846    -.6173583    .5059423
                   |
  EmployerSizecode |
                 2 |  -.0706938   .0715729    -0.99   0.323    -.2110015    .0696138
                 3 |  -.0769075   .0727363    -1.06   0.290    -.2194957    .0656808
                 4 |  -.1846449   .0736806    -2.51   0.012    -.3290842   -.0402055
                 5 |   -.218186    .090523    -2.41   0.016    -.3956423   -.0407296
                 6 |  -.2457864   .1263792    -1.94   0.052    -.4935333    .0019605
                   |
             _cons |   2.297391   .1989316    11.55   0.000     1.907416    2.687366
------------------------------------------------------------------------------------

. estat ovtest

Ramsey RESET test for omitted variables
Omitted: Powers of fitted values of ln_DiffMeanHourlyPercent

H0: Model has no omitted variables

F(3, 6218) =   2.35
  Prob > F = 0.0706

. linktest

      Source |       SS           df       MS      Number of obs   =     6,256
-------------+----------------------------------   F(2, 6253)      =    428.27
       Model |  723.908211         2  361.954106   Prob > F        =    0.0000
    Residual |  5284.73535     6,253  .845151984   R-squared       =    0.1205
-------------+----------------------------------   Adj R-squared   =    0.1202
       Total |  6008.64357     6,255  .960614479   Root MSE        =    .91932

------------------------------------------------------------------------------
ln_DiffMea~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        _hat |   1.210942   .3420908     3.54   0.000      .540327    1.881558
      _hatsq |  -.0437125   .0705351    -0.62   0.535    -.1819856    .0945605
       _cons |  -.2493378   .4111786    -0.61   0.544    -1.055389    .5567135
------------------------------------------------------------------------------