  • ordered logistic regression model: Which R2 shall one use (if at all)?

    Hi everyone,
    I have found differing opinions on which R2 is best to report for ordered logistic models (using Stata's -ologit- command). Stata's post-estimation -fitstat- command displays several R2 measures, such as McFadden's R2, Nagelkerke's R2, and McKelvey & Zavoina's R2. It is obvious that the R2 values vary considerably across the different methods of calculation.
    Others state that one should not report any of them at all, since they are all misleading.

    However, I am running a couple of ordered logistic models and would like to check whether the inclusion of additional variables leads to an increase in model fit. Therefore, I would like to report an R2 for each model, but I am not sure which one is best.

    I was hoping that the experience of the statalist community might be able to help me with my decision.

    Thank you,
    Andreas

  • #2
    You can report the log likelihood for each model: twice the difference in log likelihoods is the likelihood-ratio test statistic when the models are nested and you don't ask for robust standard errors. Alternatively, you could report the BIC or AIC, which can also be used to compare models.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
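    As a worked illustration of the arithmetic in #2, here is a sketch with made-up log likelihoods and parameter counts (illustrative values only, not from any model in this thread):

```python
import math

# Hypothetical example of comparing two NESTED models
# (made-up numbers, not from any model in this thread)
ll_restricted = -310.0   # smaller model
ll_full = -296.0         # larger model, adds 4 extra parameters
k_restricted, k_full = 18, 22
n = 200                  # observations (must be the same in both fits)

# Likelihood-ratio statistic: twice the difference in log likelihoods,
# chi-squared with k_full - k_restricted df under the null
lr = 2 * (ll_full - ll_restricted)
df = k_full - k_restricted
print(lr, df)            # 28.0 4

# AIC and BIC penalise model size and do not require nesting
aic_full = -2 * ll_full + 2 * k_full
bic_full = -2 * ll_full + k_full * math.log(n)
print(aic_full)          # 636.0
print(round(bic_full, 2))
```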

    • #3
      I'll immodestly recommend my own article on this subject: Lacy, M. G. 2006. "An Explained Variation Measure for Ordinal Response Models With Comparisons to Other Ordinal R2 Measures." Sociological Methods and Research 34: 469-520. I have an associated package -r2o- on SSC.

      One of my findings is that, if you are using an ordinal R2 measure to compare models, it probably does not make much difference which one you use. However, certain measures perform better than others with respect to bias and precision.

      Regards, Mike



      • #4
        Thanks Maarten and Mike for your helpful responses. I tried to perform a likelihood-ratio test using Stata's
        Code:
        lrtest
        command. However, Stata reported an error message when comparing model 2 with model 3:
        Code:
        . lrtest m2 m3
        df(unrestricted) = df(restricted) = 22
        r(498);
        I checked for missing values and sample sizes and found no missing values and equal sample sizes across all models. So I wonder why I am getting this error message?

        If I cannot resolve the problem, I might end up using the BIC or AIC to compare my models, as suggested by Maarten.

        • #5
          I think I found the problem, but I would like some feedback to see whether I got it right.
          The DV of the six models I run and compare is an index ranging from 1 to 5 in steps of 0.5 (1, 1.5, 2, 2.5, ..., 5), i.e. 9 categories in total. Categories 1, 2, and 2.5 have very few observations, and category 1.5 has none at all.
          Am I right to assume that this might cause the lrtest to fail?

          • #6
            You should show us the ologit command and its output. Use code tags (see point 12 of the FAQ). If both models have the same df, then they must not be nested: adding variables to a model should change the df. Or maybe you did not use -estimates store- correctly, e.g. maybe m2 and m3 are actually the same model. In any event, lrtest is a post-estimation command, and we can't really advise you without seeing what the prior commands were.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://www3.nd.edu/~rwilliam

            • #7
              Thanks Richard. Here is the code I used and the output:
              Code:
              eststo clear
              
              .  ologit riskpindex age income sex i.educ  distanceocean remoteness ccreallyhap  oftengetinfo   percCCknow countimps obsECgen    ///
              >  CCconcern otherprobmore  hhaffected,or
              
              Iteration 0:   log likelihood = -347.00617  
              Iteration 1:   log likelihood = -300.41249  
              Iteration 2:   log likelihood = -296.64682  
              Iteration 3:   log likelihood = -296.61849  
              Iteration 4:   log likelihood = -296.61849  
              
              Ordered logistic regression                       Number of obs   =        200
                                                                LR chi2(15)     =     100.78
                                                                Prob > chi2     =     0.0000
              Log likelihood = -296.61849                       Pseudo R2       =     0.1452
              
              -------------------------------------------------------------------------------
                 riskpindex | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                        age |   1.006239   .0100308     0.62   0.533     .9867696    1.026092
                     income |   1.342209   .1554166     2.54   0.011     1.069691    1.684156
                        sex |   .8046316   .2157594    -0.81   0.418     .4757186    1.360956
                            |
                       educ |
                         2  |    .974215   .3081496    -0.08   0.934     .5241011      1.8109
                         3  |   1.235317    .602259     0.43   0.665     .4751043    3.211945
                            |
              distanceocean |   1.001554   .0019597     0.79   0.427     .9977203    1.005402
                 remoteness |   .9951843   .0033507    -1.43   0.152     .9886387    1.001773
                ccreallyhap |    1.85976   .2668829     4.32   0.000     1.403803    2.463813
               oftengetinfo |   1.187779   .1475324     1.39   0.166     .9311269    1.515173
                 percCCknow |   1.131218   .1430019     0.98   0.329     .8829633    1.449273
                  countimps |   1.149413    .120243     1.33   0.183     .9363315    1.410985
                   obsECgen |    1.00423   .1834068     0.02   0.982      .702061    1.436454
                  CCconcern |   .9066822   .1549419    -0.57   0.566     .6486258    1.267407
              otherprobmore |     .95827   .1208088    -0.34   0.735     .7484748     1.22687
                 hhaffected |   3.998224   1.328117     4.17   0.000     2.085056    7.666843
              --------------+----------------------------------------------------------------
                      /cut1 |  -.8207426   1.317851                     -3.403684    1.762199
                      /cut2 |   .2203631    1.20746                     -2.146215    2.586942
                      /cut3 |   1.317247   1.171571                     -.9789902    3.613484
                      /cut4 |   3.730165   1.181367                      1.414729    6.045601
                      /cut5 |   4.304286   1.187698                       1.97644    6.632132
                      /cut6 |   5.816799   1.214914                      3.435611    8.197986
                      /cut7 |   7.388136   1.241991                      4.953878    9.822394
              -------------------------------------------------------------------------------
              
              . estimates store m2
              
              .                                                                                                                                                                                                       
              >                                                                           
              .  ologit riskpindex age income sex i.educ  distanceocean remoteness ccreallyhap  oftengetinfo   percCCknow countimps obsECgen    ///
              >  CCconcern  otherprobmore  trustcoping,or  
              
              Iteration 0:   log likelihood = -347.00617  
              Iteration 1:   log likelihood = -304.89838  
              Iteration 2:   log likelihood = -301.73521  
              Iteration 3:   log likelihood = -301.71076  
              Iteration 4:   log likelihood = -301.71076  
              
              Ordered logistic regression                       Number of obs   =        200
                                                                LR chi2(15)     =      90.59
                                                                Prob > chi2     =     0.0000
              Log likelihood = -301.71076                       Pseudo R2       =     0.1305
              
              -------------------------------------------------------------------------------
                 riskpindex | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                        age |   1.006242   .0102168     0.61   0.540     .9864156    1.026467
                     income |   1.467126   .1708475     3.29   0.001     1.167735    1.843275
                        sex |   .8317945   .2238739    -0.68   0.494     .4908162    1.409656
                            |
                       educ |
                         2  |   1.049455   .3326349     0.15   0.879     .5638552    1.953262
                         3  |   .9097132   .4419348    -0.19   0.846     .3510706    2.357299
                            |
              distanceocean |    1.00186   .0019757     0.94   0.346     .9979948     1.00574
                 remoteness |    .992031   .0033524    -2.37   0.018     .9854822    .9986233
                ccreallyhap |   2.090615   .2931786     5.26   0.000       1.5882    2.751964
               oftengetinfo |   1.156272   .1433549     1.17   0.242     .9068339    1.474321
                 percCCknow |   1.055765    .133281     0.43   0.667     .8243482    1.352147
                  countimps |   1.231524   .1290162     1.99   0.047     1.002928    1.512223
                   obsECgen |   .9260873   .1668578    -0.43   0.670     .6505589    1.318309
                  CCconcern |   1.018006   .1733177     0.10   0.917     .7291745    1.421246
              otherprobmore |    .976343    .124014    -0.19   0.850     .7611737    1.252336
                trustcoping |     .69668    .092177    -2.73   0.006     .5375411    .9029318
              --------------+----------------------------------------------------------------
                      /cut1 |  -2.088389   1.473841                     -4.977064    .8002855
                      /cut2 |  -1.017438    1.36967                     -3.701941    1.667065
                      /cut3 |   .0922356   1.331015                     -2.516505    2.700976
                      /cut4 |    2.39072    1.32518                     -.2065847    4.988024
                      /cut5 |   2.919006   1.328502                      .3151909    5.522822
                      /cut6 |    4.35445   1.346406                      1.715543    6.993357
                      /cut7 |   5.904214   1.362903                      3.232973    8.575455
              -------------------------------------------------------------------------------
              
              . estimates store m3
              
              .
              end of do-file
              
              . do "C:\Users\t410\AppData\Local\Temp\STD01000000.tmp"
              
              . lrtest m2 m3
              df(unrestricted) = df(restricted) = 22
              r(498);
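              As an aside, the "Pseudo R2" Stata prints after -ologit- is McFadden's, 1 - ll_model/ll_null, which can be verified from the iteration logs above:

```python
# Log likelihoods copied from the ologit output above
ll_null = -347.00617   # Iteration 0: intercepts-only model (same for m2 and m3)
ll_m2 = -296.61849     # final log likelihood of m2
ll_m3 = -301.71076     # final log likelihood of m3

# McFadden's pseudo R2 = 1 - ll_model / ll_null
pseudo_r2_m2 = 1 - ll_m2 / ll_null
pseudo_r2_m3 = 1 - ll_m3 / ll_null
print(round(pseudo_r2_m2, 4))   # 0.1452, as in the m2 header
print(round(pseudo_r2_m3, 4))   # 0.1305, as in the m3 header
```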

              • #8
                It looks like you changed the last variable rather than adding one. The df are therefore the same, which produces the lrtest error message you got. BIC comparisons don't require nested models, although in this case I am not sure why you wouldn't just nest the models rather than swap 1 variable out of 15.

                If you want BIC values, you can add the stats and force options to lrtest. See the help for the command.

                • #9
                  I stepped back from employing nested regressions since there is no clear hierarchy hypothesised in the models.
                  Instead, I decided to use a basic model (M1) and then add one variable at a time to the basic model (that is what you can see in the output above) to check the influence of each variable. In the final model (M6), I include all variables simultaneously.
                  Would it be okay to just report the BIC for each model and compare the models on that basis?
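                  Since both models in #7 are fit to the same 200 observations, a BIC comparison can be sketched from the reported log likelihoods (BIC = -2*ll + k*ln(n), with k = 22 estimated parameters: 15 coefficients plus 7 cutpoints):

```python
import math

n = 200            # observations in both models (from the ologit headers)
k = 22             # 15 coefficients + 7 cutpoints in each model
ll_m2 = -296.61849
ll_m3 = -301.71076

def bic(ll, k, n):
    """Schwarz's Bayesian information criterion; lower is better."""
    return -2 * ll + k * math.log(n)

print(round(bic(ll_m2, k, n), 2))   # 709.8
print(round(bic(ll_m3, k, n), 2))   # 719.98
# The BIC difference of about 10 favours m2 (the model with hhaffected).
```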
