  • Hosmer–Lemeshow test for large data sets

    With a large data set, the Hosmer–Lemeshow test gives a low p-value even though the model appears to fit perfectly.
    Increasing the number of groups from 10 to 13 results in apparently perfect calibration (p-value > 0.05).
    My question: is there a methodology for choosing the number of groups, or would increasing it to any number work?
    Thanks

  • #2
    Your post is lacking in specifics. In general, though, statistical tests carried out on large data sets are prone to making trivial differences "statistically significant." Moreover, a 2019 editorial in The American Statistician, a journal of the American Statistical Association, recommended that we abandon the concept of statistical significance anyway: https://www.tandfonline.com/doi/full...5.2019.1583913. So don't worry about whether p < 0.05 or not: it's meaningless in this context.
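To make the sample-size point concrete, here is a minimal Python sketch (not Stata, and not part of the original reply) of one group's contribution to the Hosmer–Lemeshow chi-squared statistic. The function name `hl_contribution` and the rates 0.51 and 0.50 are made up for illustration. The miscalibration is held fixed at one percentage point while only the group size changes.

```python
def hl_contribution(n, observed_rate, expected_rate):
    """One group's term in the Hosmer-Lemeshow statistic:
    (O - E)^2 / (n * pbar * (1 - pbar)), with pbar = expected_rate."""
    observed = n * observed_rate   # observed number of events, O
    expected = n * expected_rate   # expected number of events, E
    return (observed - expected) ** 2 / (expected * (1 - expected_rate))


# Same one-point gap between observed and expected rates, growing n.
for n in (1_000, 100_000, 1_000_000):
    print(n, hl_contribution(n, 0.51, 0.50))
```

With these rates the term works out to 0.0004 times n, so it grows in direct proportion to the group size. A gap that is invisible at n = 1,000 dominates the test at n = 1,000,000, even though the calibration itself has not changed.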

    The real way to use the Hosmer–Lemeshow test is to specify the -table- option and look at the actual observed and expected values in the different groups. This will enable you to see how close the fit really is, and also whether there are patterns that might suggest a change in the underlying model, for example a fit that is really tight in the middle but poor at the ends.
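For readers who want to see what such a table contains, here is a rough Python sketch of the observed-versus-expected comparison. It is an illustration of the arithmetic only, not Stata's implementation. The names `hl_table` and `hl_statistic` and the toy data are hypothetical. The grouping simply cuts the sorted predictions into roughly equal-sized slices, and it does not keep tied predicted probabilities together.

```python
def hl_table(y, p, groups=10):
    """Return one row per group: (n, observed events, expected events).

    y: 0/1 outcomes; p: predicted probabilities from the fitted model.
    """
    # Sort observations by predicted probability, then cut the sorted
    # list into roughly equal-sized consecutive groups.
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    rows = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # more groups requested than observations
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        rows.append((len(idx), observed, expected))
    return rows


def hl_statistic(rows):
    """Sum of (O - E)^2 / (E * (1 - E/n)) over groups.

    Assumes 0 < E < n in every group; otherwise the term is undefined.
    """
    return sum((o - e) ** 2 / (e * (1 - e / n)) for n, o, e in rows)


# Toy data: two risk strata, perfectly calibrated by construction.
p = [0.1] * 10 + [0.9] * 10
y = [1] + [0] * 9 + [1] * 9 + [0]
for n_g, obs, exp in hl_table(y, p, groups=2):
    print(n_g, obs, round(exp, 2))
print(hl_statistic(hl_table(y, p, groups=2)))
```

Reading the rows directly shows where observed and expected counts diverge, which the single p-value cannot tell you.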

    As for the number of groups, you can set that to anything you like with the -group()- option. And, yes, in large data sets, you should use more than the conventional 10. Just make sure that the number you pick is small enough that each group has at least 50 observations in it. Also bear in mind that if your model has only discrete predictors, then the number of groups you can pick is limited by the number of combinations of levels of those predictors.
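The two constraints above can be turned into a quick upper bound on the group count. The Python sketch below is a rule-of-thumb helper only. The name `max_groups` is hypothetical. It also assumes that the number of distinct predicted probabilities is a reasonable stand-in for the number of covariate patterns, which can undercount if two patterns happen to share a prediction.

```python
def max_groups(p, min_size=50):
    """Upper bound on the number of Hosmer-Lemeshow groups.

    p: predicted probabilities, one per observation.
    min_size: smallest acceptable group size (50, per the advice above).
    """
    # Equal-sized groups hold at least min_size observations each
    # only if there are no more than n // min_size of them.
    by_size = len(p) // min_size
    # With purely discrete predictors there are only as many distinct
    # predictions as covariate patterns, which caps the group count.
    by_patterns = len(set(p))
    return min(by_size, by_patterns)


# 999 observations, all with different predictions: size is the limit.
print(max_groups([i / 1000 for i in range(1, 1000)]))
# 1,000 observations but only two covariate patterns: patterns limit.
print(max_groups([0.2] * 500 + [0.7] * 500))
```

This gives a ceiling, not a recommendation. Any value between the conventional 10 and the ceiling satisfies both conditions, and a result of 0 means the data set is too small for even one group of the requested size.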



    • #3
      For a citation with guidance on the number of groups and how to choose them, see #13 in https://www.statalist.org/forums/for...on-survey-data

