Hello everybody,
I am trying to replicate a study on credit ratings, for which I need an ordered logit model in Stata 18.
However, I am not getting plausible results and need the community's help to spot my mistakes.
Below I present my current progress; my problems are explained along the way.
All explanatory variables are standardized, as in the study I am replicating. Moody_Ordinal2 is the dependent variable with 17 categories, where the highest score (double) indicates the best rating.
First, a short summary of my dataset:
Code:
sum Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std

    Variable |        Obs        Mean   Std. dev.        Min        Max
-------------+---------------------------------------------------------
Moody_Ordi~2 |      2,373    8.903498    5.150636          1         17
     vul_std |      3,051   -.0476733    .9984426  -2.125193   2.611845
    read_std |      3,078    .0477241    1.007052  -2.226284   2.830722
   unemp_std |      3,051   -.0525734    .9697702   -1.36213   5.270628
    gdpg_std |      3,061   -.0168634    .9593024  -5.699496   17.91622
-------------+---------------------------------------------------------
     inf_std |      3,058   -.0043643    1.045537  -.3648588    45.5398
     cab_std |      2,886    .0268091    .9834255  -5.745483   5.688852
    nbot_std |      2,699    .0021063    .9775614  -3.534807    8.74639
I started with an ologit regression:
Code:
ologit Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std, robust

Iteration 0:  Log pseudolikelihood = -5803.7668
Iteration 1:  Log pseudolikelihood = -4736.8221
Iteration 2:  Log pseudolikelihood = -4589.5218
Iteration 3:  Log pseudolikelihood = -4584.2746
Iteration 4:  Log pseudolikelihood = -4584.2609
Iteration 5:  Log pseudolikelihood = -4584.2609

Ordered logistic regression                             Number of obs =   2,112
                                                        Wald chi2(7)  = 1715.37
                                                        Prob > chi2   =  0.0000
Log pseudolikelihood = -4584.2609                       Pseudo R2     =  0.2101

--------------------------------------------------------------------------------
               |               Robust
Moody_Ordinal2 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
---------------+----------------------------------------------------------------
       vul_std |  -.5664172   .0704015    -8.05   0.000    -.7044017   -.4284328
      read_std |   1.966219   .0734044    26.79   0.000     1.822349    2.110089
     unemp_std |  -.3062116   .0714999    -4.28   0.000    -.4463488   -.1660745
      gdpg_std |   .2486259   .0521124     4.77   0.000     .1464875    .3507643
       inf_std |   -6.96565   .6329065   -11.01   0.000    -8.206124   -5.725176
       cab_std |   .5087523   .0405034    12.56   0.000     .4293672    .5881374
      nbot_std |   .0317213   .0470548     0.67   0.500    -.0605044    .1239469
---------------+----------------------------------------------------------------
         /cut1 |  -3.863205   .1295718                     -4.117161   -3.609249
         /cut2 |  -2.816824   .1041915                     -3.021035   -2.612612
         /cut3 |  -2.152922   .0884819                     -2.326344   -1.979501
         /cut4 |  -1.341813   .0794289                      -1.49749   -1.186135
         /cut5 |  -.8753156   .0781837                     -1.028553   -.7220783
         /cut6 |  -.4428108   .0759965                     -.5917613   -.2938603
         /cut7 |   .1375982   .0769733                     -.0132668    .2884632
         /cut8 |   .8270446   .0771258                      .6758808    .9782084
         /cut9 |   1.368512   .0790759                      1.213526    1.523498
        /cut10 |   1.857819   .0861269                      1.689013    2.026624
        /cut11 |   2.315379   .0923913                      2.134295    2.496463
        /cut12 |    2.93792   .0931986                      2.755254    3.120586
        /cut13 |   3.661002   .1049213                       3.45536    3.866644
        /cut14 |   4.077359   .1126143                      3.856638    4.298079
        /cut15 |   4.596146   .1190795                      4.362755    4.829538
        /cut16 |   4.942797   .1274074                      4.693083    5.192511
This was followed by the Brant test:
Code:
brant

Brant test of parallel regression assumption

             |       chi2   p>chi2      df
-------------+------------------------------
         All |     859.46    0.000     105
-------------+------------------------------
     vul_std |     151.79    0.000      15
    read_std |     118.19    0.000      15
   unemp_std |     148.73    0.000      15
    gdpg_std |      70.07    0.000      15
     inf_std |      64.24    0.000      15
     cab_std |      60.02    0.000      15
    nbot_std |      89.42    0.000      15

A significant test statistic provides evidence that the parallel
regression assumption has been violated.
As the output shows, every coefficient violates the parallel regression assumption.
Can somebody explain why the p-value is equal to zero for all explanatory variables?
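One possible explanation is that the table rounds p-values to three decimals, so very small values are displayed as 0.000. A minimal sketch to check this, assuming only Stata's built-in chi2tail(df, x) function and two statistics copied by hand from the Brant output above:

```stata
* Upper-tail chi-squared probabilities for two reported Brant statistics,
* each with 15 degrees of freedom, displayed in scientific notation.

* vul_std (largest single-variable statistic)
display %9.2e chi2tail(15, 151.79)

* cab_std (smallest single-variable statistic)
display %9.2e chi2tail(15, 60.02)
```

If both values turn out tiny but positive, the zeros in the table would be a display effect rather than exact zeros.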
Next, I ran a gologit2 estimation, which is currently not working as hoped. I used autofit(0.1) to identify all parameters for which the parallel regression assumption is violated.
Code:
gologit2 Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std, vce(robust) autofit(0.1)

------------------------------------------------------------------------------
Testing parallel lines assumption using the .1 level of significance...

Step 1: Constraints for parallel lines are not imposed for
        vul_std (P Value = 0.00000)
        read_std (P Value = 0.00000)
        unemp_std (P Value = 0.00051)
        gdpg_std (P Value = 0.00000)
        inf_std (P Value = 0.00000)
        cab_std (P Value = 0.00000)
        nbot_std (P Value = 0.00000)

------------------------------------------------------------------------------
Generalized Ordered Logit Estimates             Number of obs   =      2,112
                                                Wald chi2(112)  =    2315.83
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -4262.9912               Pseudo R2       =     0.2655

[...]

In the end, I get this warning:

WARNING! 2071 in-sample cases have an outcome with a predicted
probability that is less than 0. See the gologit2 help section
on Warning Messages for more information.
This seems considerable: 2,071 cases are about 98% of the 2,112 observations.
Could this be due to earlier mistakes on my part?
Could you please suggest what I could change to reduce the number of cases with negative predicted probabilities?
I have already tried following the advice in Mr. Williams's troubleshooting blog.
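For reference, a minimal sketch of further checks that might help narrow this down. It assumes it is run directly after the gologit2 command above, so that e(sample) marks the 2,112 observations actually used. The choice of inf_std and gdpg_std for a closer look is my own assumption, based only on their large maxima in the summary table:

```stata
* Assumes the gologit2 model above was the last estimation command,
* so e(sample) flags the observations used in the fit.

* How many observations fall into each of the 17 rating categories?
tabulate Moody_Ordinal2 if e(sample)

* Detailed distribution of the two regressors with the largest maxima
summarize inf_std gdpg_std if e(sample), detail
```

Sparsely populated rating categories or a few extreme regressor values would be natural things to look at in this output, but that is only a guess until it has been inspected.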
Please let me know if you need any further information or calculations.
Many thanks in advance to all of you!
Best regards
Simon