Generalized Ordered logit for credit ratings

Simon Setz

12 Feb 2024, 15:21

Hello everybody,

I am trying to replicate a study on credit ratings, for which I need an ordered logit model in Stata 18.
However, my results are not plausible, and I would appreciate the community's help in spotting my mistakes.

Below I present my current progress and describe each problem at the step where it occurs.

First, a short summary of my dataset:

Code:

sum Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
Moody_Ordi~2 |      2,373    8.903498    5.150636          1         17
     vul_std |      3,051   -.0476733    .9984426  -2.125193   2.611845
    read_std |      3,078    .0477241    1.007052  -2.226284   2.830722
   unemp_std |      3,051   -.0525734    .9697702   -1.36213   5.270628
    gdpg_std |      3,061   -.0168634    .9593024  -5.699496   17.91622
-------------+---------------------------------------------------------
     inf_std |      3,058   -.0043643    1.045537  -.3648588    45.5398
     cab_std |      2,886    .0268091    .9834255  -5.745483   5.688852
    nbot_std |      2,699    .0021063    .9775614  -3.534807    8.74639

All explanatory variables are standardized, as in the study I am replicating. Moody_Ordinal2 is the dependent variable with 17 categories, where the highest score (double) represents the best rating.

First, I ran an ologit regression:
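For readers who want to reproduce the setup, one common way to build such standardized variables is sketched below. The raw variable names (vul, read, unemp, and so on) are placeholders assumed for illustration; only the _std names appear in my data as shown above.

```stata
* Sketch only: the raw variable names in this list are hypothetical placeholders.
foreach v in vul read unemp gdpg inf cab nbot {
    * egen's std() computes (x - mean) / sd over the nonmissing observations
    egen `v'_std = std(`v')
}
```

Variables built this way have mean 0 and standard deviation 1 over the sample on which std() was computed, so a different sample or subgroup used at that step would show up as small deviations in the summary table.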

Code:

ologit Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std, robust

Iteration 0: Log pseudolikelihood = -5803.7668
Iteration 1: Log pseudolikelihood = -4736.8221
Iteration 2: Log pseudolikelihood = -4589.5218
Iteration 3: Log pseudolikelihood = -4584.2746
Iteration 4: Log pseudolikelihood = -4584.2609
Iteration 5: Log pseudolikelihood = -4584.2609

Ordered logistic regression                             Number of obs =   2,112
                                                        Wald chi2(7)  = 1715.37
                                                        Prob > chi2   =  0.0000
Log pseudolikelihood = -4584.2609                       Pseudo R2     =  0.2101

--------------------------------------------------------------------------------
               |               Robust
Moody_Ordinal2 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
---------------+----------------------------------------------------------------
       vul_std |  -.5664172   .0704015    -8.05   0.000    -.7044017   -.4284328
      read_std |   1.966219   .0734044    26.79   0.000     1.822349    2.110089
     unemp_std |  -.3062116   .0714999    -4.28   0.000    -.4463488   -.1660745
      gdpg_std |   .2486259   .0521124     4.77   0.000     .1464875    .3507643
       inf_std |   -6.96565   .6329065   -11.01   0.000    -8.206124   -5.725176
       cab_std |   .5087523   .0405034    12.56   0.000     .4293672    .5881374
      nbot_std |   .0317213   .0470548     0.67   0.500    -.0605044    .1239469
---------------+----------------------------------------------------------------
         /cut1 |  -3.863205   .1295718                     -4.117161   -3.609249
         /cut2 |  -2.816824   .1041915                     -3.021035   -2.612612
         /cut3 |  -2.152922   .0884819                     -2.326344   -1.979501
         /cut4 |  -1.341813   .0794289                      -1.49749   -1.186135
         /cut5 |  -.8753156   .0781837                     -1.028553   -.7220783
         /cut6 |  -.4428108   .0759965                     -.5917613   -.2938603
         /cut7 |   .1375982   .0769733                     -.0132668    .2884632
         /cut8 |   .8270446   .0771258                      .6758808    .9782084
         /cut9 |   1.368512   .0790759                      1.213526    1.523498
        /cut10 |   1.857819   .0861269                      1.689013    2.026624
        /cut11 |   2.315379   .0923913                      2.134295    2.496463
        /cut12 |    2.93792   .0931986                      2.755254    3.120586
        /cut13 |   3.661002   .1049213                       3.45536    3.866644
        /cut14 |   4.077359   .1126143                      3.856638    4.298079
        /cut15 |   4.596146   .1190795                      4.362755    4.829538
        /cut16 |   4.942797   .1274074                      4.693083    5.192511
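
Note that the estimation uses 2,112 observations although Moody_Ordinal2 has 2,373 nonmissing values, because Stata drops every observation with a missing value in any model variable. A minimal sketch for checking this, where nmiss is a helper variable name introduced here only for illustration:

```stata
* Count missing values per observation across the outcome and all regressors
* (nmiss is a hypothetical helper variable, not part of the original data)
egen nmiss = rowmiss(Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std)

* Complete cases: this should match the "Number of obs" reported by ologit
count if nmiss == 0

* Equivalent check right after estimation, using the estimation-sample flag
count if e(sample)
```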

I then ran the Brant test:

Code:

brant

Brant test of parallel regression assumption

             |       chi2   p>chi2      df
-------------+------------------------------
         All |     859.46    0.000     105
-------------+------------------------------
     vul_std |     151.79    0.000      15
    read_std |     118.19    0.000      15
   unemp_std |     148.73    0.000      15
    gdpg_std |      70.07    0.000      15
     inf_std |      64.24    0.000      15
     cab_std |      60.02    0.000      15
    nbot_std |      89.42    0.000      15

A significant test statistic provides evidence that the parallel
regression assumption has been violated.

As you can see, the test rejects the parallel regression assumption for every explanatory variable.
Can somebody explain to me why the p-value is equal to zero for all explanatory variables?

Next, I estimated a gologit2 model, which is currently not behaving as hoped. I used autofit(0.1) to identify all variables for which the parallel regression assumption is violated.
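
The 0.000 shown in the table is presumably a value rounded to three decimals rather than an exact zero. As a rough check, the tail probabilities can be recomputed from the reported statistics with Stata's chi2tail() function; the numbers below are copied from the Brant output above:

```stata
* Upper-tail chi-squared probability for the vul_std row (chi2 = 151.79, df = 15)
display %12.4e chi2tail(15, 151.79)

* Same calculation for the joint test (chi2 = 859.46, df = 105)
display %12.4e chi2tail(105, 859.46)
```

Scientific notation (%12.4e) is used so that very small p-values are not displayed as zero.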

Code:

gologit2 Moody_Ordinal2 vul_std read_std unemp_std gdpg_std inf_std cab_std nbot_std, vce(robust) autofit(0.1)

-----------------------------------------------------------------------------
Testing parallel lines assumption using the .1 level of significance...

Step 1: Constraints for parallel lines are not imposed for
vul_std (P Value = 0.00000)
read_std (P Value = 0.00000)
unemp_std (P Value = 0.00051)
gdpg_std (P Value = 0.00000)
inf_std (P Value = 0.00000)
cab_std (P Value = 0.00000)
nbot_std (P Value = 0.00000)


------------------------------------------------------------------------------

Generalized Ordered Logit Estimates                     Number of obs  =   2,112
                                                        Wald chi2(112) = 2315.83
                                                        Prob > chi2    =  0.0000
Log pseudolikelihood = -4262.9912                       Pseudo R2      =  0.2655


[...]

In the end, I get this warning:
WARNING! 2071 in-sample cases have an outcome with a predicted probability that is
less than 0. See the gologit2 help section on Warning Messages for more information.

This seems considerable relative to the 2,112 observations.
Could it be related to earlier mistakes on my part?
Could you please suggest what I could change to reduce the number of cases with negative predicted probabilities?
I have already tried to apply the advice in Mr. Williams's troubleshooting blog.

Please let me know if you need any further information or calculations.

Many thanks to all of you in advance!
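
In case it helps with the diagnosis, the distribution of the 17 rating categories in the estimation sample can be inspected as sketched below. This assumes, without my having confirmed it, that sparsely populated categories may be part of the problem, since the unconstrained model estimates 112 slope coefficients from 2,112 observations:

```stata
* Run directly after the gologit2 (or ologit) command so that e(sample) is set.
* Shows how many estimation-sample observations fall into each rating category.
tabulate Moody_Ordinal2 if e(sample)
```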

Best regards

Simon

Tags: brant, generalized ordered logit, gologit2, ordered logit, parallel reg assump.
