  • Proportional odds assumption in ordered logit regression: how to test multiple parameters simultaneously

    Dear Statalisters,

    I'm using Stata 15.1.

    I have an ordinal outcome that can take 5 distinct values. Among my predictors, I have some variables representing percentages that sum to 100. For example, I have 5 age groups that are mutually exclusive and cover all possible age values. I know I can use the "autofit" option of the gologit2 command to test which parameters fulfil the proportional odds assumption at a given significance level. Nevertheless, I'm afraid that, with linearly dependent variables such as these age-group rates, results would change depending on the reference category I select. Is there a way to avoid such arbitrariness?

    First of all, is there a way for Stata to perform such a joint test automatically (with gologit2, oglm or any other command)?

    Otherwise, I thought about replicating the "gologit2, autofit" behaviour manually, by comparing separate models (i.e., a reference model M* vs a model differing from M* only in the proportional odds assumption for a single variable). Do I understand correctly that gologit2 is basically using a stepwise procedure, i.e. starting with an unconstrained model and then adding constraints one by one, each time for the parameter with the highest (i.e., closest to 1) p-value, until the test for proportional odds is significant for every parameter that remains unconstrained?
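    For concreteness, the single-variable comparison I have in mind would be something like this sketch (y and x1, x2, x3 are placeholder names):

    Code:
    * reference model M*: proportional odds imposed on every variable
    gologit2 y x1 x2 x3, pl store(m_star)
    * same model, but with the proportional odds constraint relaxed for x1 only
    gologit2 y x1 x2 x3, npl(x1) store(m_x1)
    * likelihood-ratio test of the proportional odds assumption for x1
    lrtest m_star m_x1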

  • #2
    If I ever get ambitious, I will modify gologit2 to test factor variables as a block, e.g. something like

    gologit2 y x1 x2 (i.x3), autofit

    For now, you can manually impose whatever constraints you want. Perhaps start with an autofitted model and then modify it to impose or not impose constraints on all the dummies for a factor variable.
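    A rough sketch of that manual approach (the variable names and the constraint autofit happens to pick are hypothetical):

    Code:
    * start from an autofitted model
    gologit2 y x1 x2 i.x3, autofit
    * suppose autofit relaxed proportional odds for 2.x3 but not 3.x3;
    * to treat the x3 dummies as a block, relax the constraint for all of them
    gologit2 y x1 x2 i.x3, npl(2.x3 3.x3)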

    Your description of the autofit procedure is correct. See section 3.1 of

    https://journals.sagepub.com/doi/pdf...867X0600600104

    Incidentally, if I had my life to live over, the default for autofit would be .01 or I would build in something like a Bonferroni test to take into account that multiple tests are being done.
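    For now you can pass a stricter level to autofit yourself, e.g. (placeholder names):

    Code:
    * use a 0.01 significance level for the autofit tests instead of the default 0.05
    gologit2 y x1 x2 x3, autofit(0.01)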
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Originally posted by Richard Williams
      If I ever get ambitious, I will modify gologit2 to test factor variables as a block, e.g. something like

      gologit2 y x1 x2 (i.x3), autofit
      Yes: this is like the issue I'm facing, but at the individual level (put another way, what I'm doing is the population-level version of factor variables). Thus, what I'm looking for is something like:
      gologit2 y x1 x2 (x3 x4 x5), autofit

      Originally posted by Richard Williams
      For now, you can manually impose whatever constraints you want. Perhaps start with an autofitted model and then modify it to impose or not impose constraints on all the dummies for a factor variable.

      Your description of the autofit procedure is correct. See section 3.1 of

      https://journals.sagepub.com/doi/pdf...867X0600600104

      Incidentally, if I had my life to live over, the default for autofit would be .01 or I would build in something like a Bonferroni test to take into account that multiple tests are being done.
      Thank you. However, it seems to me one either has to leave the selection entirely to "autofit", or run models where one decides which variables are allowed to violate the parallel lines assumption and which are forced to have proportional odds (by using the pl or the npl option appropriately).
      Originally posted by Richard Williams
      Incidentally, if I had my life to live over, the default for autofit would be .01 or I would build in something like a Bonferroni test to take into account that multiple tests are being done.
      EDITED PART SINCE I HAD CONFUSED CONSTRAINED AND UNCONSTRAINED VARIABLES:

      Wouldn't it be better to perform a global test on the constrained variables and reject the constraints when that global test is statistically significant? I find "omnibus" tests much better than Bonferroni-like corrections, or an arbitrary lowering of the significance threshold. I think the relevance of the global test is acknowledged in "gologit2, autofit" as well, given that it performs a Wald test on the whole set of constraints. I personally view significant tests for single variables, combined with a nonsignificant global Wald test, much as I view significant parameter estimates in a regression whose global F-test is nonsignificant.
      Last edited by Federico Tedeschi; 03 Dec 2019, 14:59.
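      To make the omnibus idea concrete, a sketch (again with hypothetical variable names): the global test of all proportional odds constraints at once can be obtained by comparing the fully constrained model with the fully unconstrained one, e.g.

      Code:
      * fully constrained model: proportional odds imposed everywhere
      gologit2 y x1 x2 x3, pl store(all_pl)
      * fully unconstrained generalized ordered logit (gologit2's default)
      gologit2 y x1 x2 x3, store(no_pl)
      * global ("omnibus") likelihood-ratio test of the proportional odds assumption
      lrtest all_pl no_pl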



      • #4
        Originally posted by Federico Tedeschi
        I personally view significant tests for single variables, combined with a nonsignificant global Wald test, much as I view significant parameter estimates in a regression whose global F-test is nonsignificant.
        On second thought, my statement doesn't take into account the fact that a stepwise procedure is performed.
        The proper comparison would be with a backward regression. Thus, my idea amounts to saying: "continue removing variables even if their parameter estimates are statistically significant, as long as the test on the whole set of removed variables remains nonsignificant". I understand this may be questionable. An alternative could be comparing the results of the backward procedure with those of a forward procedure, i.e. one starting from the ordinary ordered logit model and removing proportional odds constraints one by one where the test is statistically significant.
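        One step of that forward-style check might look like this sketch (hypothetical variable names):

        Code:
        * start from the fully constrained (ordinary ordered logit) model
        gologit2 y x1 x2 x3, pl
        estimates store m_pl
        * test relaxing the proportional odds constraint for each variable in turn
        foreach v of varlist x1 x2 x3 {
            quietly gologit2 y x1 x2 x3, npl(`v')
            quietly lrtest m_pl
            display "`v': p-value for relaxing proportional odds = " %6.4f r(p)
        }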



        • #5
          Again, you can do whatever you want for imposing or not imposing constraints if you use the pl or npl options. It would be ideal if you had some great theory of when proportional odds should or should not hold, but most people have no such theory, which is why they use autofit.

          I have often described autofit as the lesser of three evils. You could use the proportional odds model, knowing that its assumptions are violated. Or, you could use mlogit, which may estimate far more parameters than is necessary. Or, you can use autofit, knowing that assumptions will not be violated in this case and that you won't be estimating a bunch of unnecessary parameters.

          And, if autofit still makes you queasy, you can use stricter P values, which reduces the likelihood that non-violations of proportional odds will show up as significant just by chance. Or, you could get really rigorous, developing a model with one data set and then seeing if it works ok with another.
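          A minimal sketch of that split-sample idea (y and x1-x3 are placeholder names):

          Code:
          * split the data into a development half and a holdout half
          set seed 12345
          generate byte train = runiform() < .5
          * develop the constraint pattern on the training half, with a stricter alpha
          gologit2 y x1 x2 x3 if train, autofit(0.01)
          * suppose autofit relaxed proportional odds only for x2:
          * check that pattern on the holdout half
          gologit2 y x1 x2 x3 if !train, npl(x2)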

          Also, I don't think autofit is really any worse than a fractional polynomial regression where you let the program pick the best powers.
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology
          StataNow Version: 19.5 MP (2 processor)

          EMAIL: [email protected]
          WWW: https://www3.nd.edu/~rwilliam



          • #6

            Originally posted by Richard Williams
            Again, you can do whatever you want for imposing or not imposing constraints if you use the pl or npl options.
            […]
            Or, you could get really rigorous, developing a model with one data set and then seeing if it works ok with another.
            Yes. The problem is that I have 27 parameters related to the regressors (i.e., excluding cutpoints) in the ordinal logit model. Grouping linearly dependent parameters as if they belonged to the same variable, I have 15 regressors. Since I have to decide for each one whether the parallel lines assumption holds, I have 2^15 = 32,768 combinations.
            Originally posted by Richard Williams
            Also, I don't think autofit is really any worse than a fractional polynomial regression where you let the program pick the best powers.
            1) Yes, your analogy sounds better than mine: when the question is whether or not to include a covariate in a regression, one may prefer to include it even if it is not significant, since we want to control for it. But whether or not to assume parallel lines is a question about a covariate that we are already including, so the fractional polynomial case is more similar (at least if we impose the constraint that the variable has to be included in the model anyway). And, for variables that are already included, maybe we want to be more parsimonious.

            2) I have to confess, however, that in my case I was considering "gologit2" just to select which variables should be included under the proportional odds assumption and which should not. For my final model, I had thought of "oglm" with the "hetero" option, i.e. a heterogeneous choice model, and the question was: which variables should I include in the list of determinants of heteroskedasticity? In that case, variables for which we don't assume proportional odds contribute only one extra parameter (the scale/heterogeneity one), and one may prefer to include a variable in the list in order to be more robust.

            3) In the case of choosing the set of covariates, Stata is able to compare all possible models with a given number of parameters through the Furnival-Wilson leaps-and-bounds algorithm (vselect and gvselect) and, from a Bayesian perspective, to give you the inclusion probability of each variable (miinc). I would need something like that for the parallel lines assumption.

            4) I think I'm missing some theory, since it is not clear to me how residuals are defined in the case of ordered logistic regression, but I guess the choice of which variables to include among determinants of heteroscedasticity should be based on them.

            5) For the moment, I think what I'm going to do is a backward stepwise approach with the "oglm" command: A) using the "lnsigma" p-values for "single" variables, and B) performing a likelihood-ratio test for the linearly dependent ones.
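            As a sketch of what I mean (hypothetical names: x1 and x2 plus a linearly dependent block r1 r2 r3, e.g. percentages summing to 100):

            Code:
            * full heteroskedasticity list
            oglm y x1 x2 r1 r2 r3, hetero(x1 x2 r1 r2 r3)
            estimates store h_full
            * A) for single variables, inspect the lnsigma-equation p-values directly
            * B) for the linearly dependent block, LR test of dropping the whole block
            *    from the heteroskedasticity equation
            oglm y x1 x2 r1 r2 r3, hetero(x1 x2)
            estimates store h_reduced
            lrtest h_full h_reduced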



            • #7
              I'm not clear on why you are so reluctant to just use autofit. Unless you have a great theory it may be hard and tedious to work out something on your own, and you still run the risk of capitalizing on chance.

              But, if you are greatly concerned, maybe you should just use mlogit. I find mlogit parameters tedious to interpret. But, maybe you can just focus on the marginal effects and adjusted predictions. See, for example,

              https://www3.nd.edu/~rwilliam/stats3/Margins05.pdf

              oglm is sometimes worth considering as an alternative to gologit. See

              https://www.stata-journal.com/articl...article=st0208

              See especially section 4.3.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              StataNow Version: 19.5 MP (2 processor)

              EMAIL: [email protected]
              WWW: https://www3.nd.edu/~rwilliam



              • #8
                Originally posted by Richard Williams
                I'm not clear on why you are so reluctant to just use autofit.
                Because I don't want results to depend on the choice of the reference variables (i.e., the variables left out of the regression to guarantee identifiability). Let me tell you about one of these variable clusters: age. I have 4 variables: rate_18_34, rate_35_44, rate_45_54 and rate_55_64. These variables are percentages, which creates one constraint (obviously, that their sum is 100). Thus, suppose I choose "rate_18_34" as the variable to omit from the regression to guarantee identification. Then I'd like to tell "autofit": "Listen, use whatever procedure you like, but rate_35_44, rate_45_54 and rate_55_64 must be constrained to have proportional odds either all together or not at all". Otherwise I think my results would change if I switched the omitted variable to, say, rate_35_44.


                But, if you are greatly concerned, maybe you should just use mlogit. I find mlogit parameters tedious to interpret. But, maybe you can just focus on the marginal effects and adjusted predictions. See, for example,

                https://www3.nd.edu/~rwilliam/stats3/Margins05.pdf
                Thank you. In this case, however, the number of parameters would "explode" (from 27 to 108). I think analyses of margins are interesting when you focus on one or two predictors: in my case I have 15, all of interest, so I'd prefer to have just one parameter per variable.
                oglm is sometimes worth considering as an alternative to gologit.
                See especially section 4.3.
                Yes: this is precisely what I'm using, with the "hetero" option. As for the reason, let me quote you:
                from a substantive standpoint, the simplicity of the oglm model and the insights about differences in variability across time and gender that are gained by adding only two parameters to the ordered logit model may be highly appealing.
                That's the point: simplicity, i.e. one parameter per variable, and the rest is heterogeneity. I've performed this procedure:

                5) For the moment, I think what I'm going to do is a backward stepwise approach with the "oglm" command: A) using the "lnsigma" p-values for "single" variables, and B) performing a likelihood-ratio test for the linearly dependent ones.
                but even using 0.01 as the threshold for exclusion, I get too many variables in my "hetero" list, and then nothing turns out significant (actually, my p-values are all above 0.26).


                Thank you. The document I was looking at was actually this one:
                and yesterday I left the office just after noticing that there's a way to perform a stepwise procedure on the heteroskedasticity/scale variables only, which is something I'd like to try.
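                If I read the oglm documentation correctly, this works through the flip option, which swaps the roles of the command-line varlist and the hetero() varlist so that stepwise selection operates on the scale equation. A sketch, under that assumption and with hypothetical names (z1-z3 are the candidate heteroskedasticity variables, x1-x3 the choice-equation variables):

                Code:
                * backward selection on the heteroskedasticity candidates z1-z3;
                * flip is assumed to put the command-line varlist into the lnsigma equation
                stepwise, pr(.05): oglm y z1 z2 z3, flip hetero(x1 x2 x3)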




                • #9
                  Your post got me a little concerned so I did some checking.

                  I believe that, if the proportional odds constraint is imposed on ALL the categories of a factor variable, or else on NONE of them, it doesn't matter what you choose as the reference category. This code illustrates that:

                  Code:
                  webuse nhanes2f, clear
                  * relax proportional odds for all race categories, first with race 1 and then race 2 as reference
                  gologit2 health weight height i.race, npl(1b.race 2.race 3.race) sto(m1)
                  gologit2 health weight height ib2.race, npl(1.race 2b.race 3.race) sto(m2)
                  * impose proportional odds on all race categories, again with both reference choices
                  gologit2 health weight height i.race, pl(1b.race 2.race 3.race) sto(m3)
                  gologit2 health weight height ib2.race, pl(1.race 2b.race 3.race) sto(m4)
                  * the fit statistics are identical within each pair, whatever the reference category
                  esttab m1 m2 m3 m4, scalars(chi2 r2_p df_m)
                  Unfortunately, gologit2 does not currently provide an easy way to impose such an all-or-none restriction on factor variables. But, you can do it manually, like I did above. You could, for example, run autofit -- and if some categories of a variable violated proportional odds while others did not, you could relax the proportional odds restriction for all of them.
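                  With the age-share block from #8, that manual workaround might look something like this (the outcome and the other regressors, y x1 x2, are placeholders):

                  Code:
                  gologit2 y x1 x2 rate_35_44 rate_45_54 rate_55_64, autofit
                  * if autofit relaxes proportional odds for only some of the three rates,
                  * refit relaxing (or imposing) the constraint for the whole block, e.g.
                  gologit2 y x1 x2 rate_35_44 rate_45_54 rate_55_64, npl(rate_35_44 rate_45_54 rate_55_64)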

                  Sooner or later I will add options to make this easy to do. But it won't be this calendar year!
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  StataNow Version: 19.5 MP (2 processor)

                  EMAIL: [email protected]
                  WWW: https://www3.nd.edu/~rwilliam



                  • #10

                    Originally posted by Richard Williams
                    maybe you can just focus on the marginal effects and adjusted predictions. See, for example,

                    https://www3.nd.edu/~rwilliam/stats3/Margins05.pdf
                    I think marginal effects would be what I need in the binomial case, because it seems to me they would give me effects in a counterfactual setting, i.e. in an ideal case where we could keep all the other variables in the population fixed and modify only the variable we are considering. But in my case, I'd get an effect on each of the 5 outcome probabilities separately. What I'd like to have is a sort of mean across the 4 log odds ratios (or probabilities) between consecutive outcome values.
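                    Just to illustrate what I mean with a plain ologit fit (hypothetical names, with y taking the values 1 to 5): margins returns one set of average marginal effects per outcome category.

                    Code:
                    ologit y x1 x2 x3
                    * one set of average marginal effects of x1 for each of the 5 outcome probabilities
                    forvalues k = 1/5 {
                        margins, dydx(x1) predict(outcome(`k'))
                    }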

                    Section 4.4 explains that, once we include a variable in the "hetero" option, it is a bit as if we included its interactions with all the other variables, so that results are not invariant to the choice of the reference category (for categorical variables) and express an effect at the "0" value for interval variables. With "lrtest", it seems to me the test is of the type: "H0: the given variable has no effect on the outcome" vs "H1: the given variable has an effect on the outcome". What I'd need is slightly different, i.e.: "H0: the given variable has no effect on the average outcome value (or on the mean log odds ratio)" vs "H1: the given variable has an effect on the average outcome value (or on the mean log odds ratio)".
                    Originally posted by Richard Williams
                    I believe that, if the proportional odds constraint is imposed on ALL the categories of a factor variable, or else on NONE of them, it doesn't matter what you choose as the reference category.
                    I agree with that. But still, in my case I'd have 8 different combinations (constraint vs no constraint on each cluster of variables). And what about the other 12 variables? I think I cannot use "autofit" on them alone: if that option is used, it applies to all variables.
                    Originally posted by Richard Williams
                    Unfortunately, gologit2 does not currently provide an easy way to impose such an all-or-none restriction on factor variables. But, you can do it manually, like I did above. You could, for example, run autofit -- and if some categories of a variable violated proportional odds while others did not, you could relax the proportional odds restriction for all of them.
                    Yes, but I think there would still be the possibility of getting different results with different reference categories. For example, in a regression one could find no significant variables when choosing a "central" category as the reference, while, with an "extreme" category as the reference, one could find significance for the parameters related to the extreme categories at the other end of the spectrum.
                    I was thinking that the constraint I have in mind should basically say that either all or none of the variables X1-X4 have lnsigma = 0, thus something like: if lnsigma_X1*lnsigma_X2*lnsigma_X3*lnsigma_X4 = 0, then lnsigma_X1 = 0, lnsigma_X2 = 0, lnsigma_X3 = 0 and lnsigma_X4 = 0 (relying on the fact that, in a continuous framework, the probability of an estimate being exactly 0 is 0).

