  • Measures of fit for a logistic regression model

    Hi all,

    I am conducting a study in which I have measured 2 variables. I know that theoretically there is a causal relation between the two variables, such that one of them (cause, C) leads to the other (effect, E). I measured C at time point T1 (Ct1), whereas E was measured at two time points, T1 and T2 (Et1 and Et2, respectively). C is a categorical ordinal variable with 4 levels/categories and E is a binary variable.

    I have 600 unique subject IDs in total, and from each ID the variables Ct1, Et1, and Et2 were measured for 8 different body parts, yielding 4,800 entries in total.

    I'm using logistic regression to estimate odds ratios for how C (measured at T1) predicts the development of E at T2, adjusted for the presence of E at T1. Put another way, I want to see how C predicts development of new E at T2, adjusted for preexisting E at T1.


    Code:
     . logistic Et2 i.Ct1 Et1, vce(cluster ID)
    
    Logistic regression                             Number of obs     =      4,800
                                                    Wald chi2(4)      =     245.01
                                                    Prob > chi2       =     0.0000
    Log pseudolikelihood = -386.54839               Pseudo R2         =     0.3717
    
                                      (Std. Err. adjusted for 600 clusters in ID)
    -----------------------------------------------------------------------------
                 |               Robust
             Et2 | Odds Ratio   Std. Err.      z    P>|z|    [95% Conf. Interval]
    -------------+---------------------------------------------------------------
             Ct1 |
               1 |   3.198967   1.232098     3.02   0.003     1.503714    6.805409
               2 |   4.993883   3.840523     2.09   0.037      1.10618    22.54504
               3 |   55.21636   33.82345     6.55   0.000     16.62088    183.4348
                 |
             Et1 |   53.07789   20.83338    10.12   0.000     24.59303    114.5553
           _cons |   .0118504   .0019137   -27.47   0.000     .0086352    .0162627
    -----------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    Despite a large sample, some of the cells have small numbers:

    Code:
    . tab Ct1 Et2
    
               |          Et2
           Ct1 |         0          1 |     Total
    -----------+----------------------+----------
             0 |     4,438         67 |     4,505
             1 |       198         38 |       236
             2 |        22          8 |        30
             3 |         6         23 |        29
    -----------+----------------------+----------
         Total |     4,664        136 |     4,800
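
    For intuition about how much noise those small cells carry, the crude (unadjusted) odds ratios and Woolf confidence intervals can be computed straight from the cross-tab; a minimal Python sketch using the counts above (illustrative only — these are not the model's cluster-adjusted estimates):

    ```python
    import math

    # Counts from -tab Ct1 Et2- above: (Et2==0, Et2==1) for each level of Ct1
    counts = {0: (4438, 67), 1: (198, 38), 2: (22, 8), 3: (6, 23)}

    ref_no, ref_yes = counts[0]                 # Ct1 == 0 is the reference level
    for level in (1, 2, 3):
        no, yes = counts[level]
        or_ = (yes * ref_no) / (no * ref_yes)   # crude odds ratio vs. level 0
        # Woolf standard error on the log scale and large-sample 95% CI
        se = math.sqrt(1 / yes + 1 / no + 1 / ref_yes + 1 / ref_no)
        lo = math.exp(math.log(or_) - 1.96 * se)
        hi = math.exp(math.log(or_) + 1.96 * se)
        print(f"Ct1=={level}: crude OR = {or_:7.2f}  95% CI [{lo:7.2f}, {hi:7.2f}]")
    ```

    The very wide interval for Ct1==3 (driven by the cell with only 6 non-events) shows how little information those sparse cells contain, independent of any formal goodness-of-fit test.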
    My question is: should I test model fit when I include only two predictors in the model?

    As far as I understand, the Pearson chi-squared goodness-of-fit test is problematic if there are few observations for some values of the predictor variable(s), and the Hosmer–Lemeshow test is best with more than five predictors and works best with continuous predictors. So neither of the two tests seems well suited to my case, but are there any other diagnostic tests I should run?

    Also, if you have any concerns about my approach in general, or if there is other model testing I should do, I would appreciate it if you let me know.

    Best wishes,
    Jane

  • #2
    I think that the concerns you have about the Pearson chi-square GOF test and the Hosmer-Lemeshow test refer to the fact that both produce test statistics with approximately chi-square sampling distributions, and that the approximation is not so good under the circumstances you face.

    That is all true. But, for my part, I have always felt that it is a category error to think of goodness of fit as a yes-no phenomenon. Both of these statistics, if you ignore the issue of their "significance," are clearly calculated as direct measures of the agreement between the data and the model predictions. They are reasonable definitions of goodness of fit under any circumstances. Indeed, if you ponder for a long time the different ways one might assess the extent of agreement between the model predictions and the data, you don't come up with very many viable alternatives (although there are some). Whatever measures of fit you use, however, you cannot escape the fact that small cells produce noisy results.

    My own approach to assessing calibration in logistic models is usually to calculate observed and expected outcome probabilities in groups defined by percentiles of risk (sometimes the classical deciles, sometimes more or fewer groups depending on the granularity of the data and the sample size) and look at O and E in each of those groups. I then explore them (often graphically) to try to learn where the model fits the data well and where it looks like it needs improvement. Often knowing where the model fits best and not so well enables you to make a good guess about what changes to the model would improve it. And, in any case, it leaves you in a good position to appraise how well, and where, the model fits the data. Sometimes when the fit looks poor in a small cell, looking at (O-E)^2/E for that cell as a 1 df chi-square statistic is helpful in appraising to what extent that degree of poor fit is just consistent with the noisiness of a small cell's results. (And if the cell is so small that (O-E)^2/E isn't remotely like a chi-square statistic, you can always calculate the likelihood of observing O as a binomial outcome with expected value E in a sample of that size.)
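
    As a sketch of the approach just described (illustrative Python; the simulated risks here stand in for a model's fitted probabilities, which in Stata you would get from -predict- after -logistic-):

    ```python
    import random

    random.seed(0)

    # Simulated stand-ins for real data: in practice p_hat would be the model's
    # fitted probabilities and y the observed binary outcome for each observation.
    n = 4800
    p_hat = sorted(random.betavariate(0.5, 10) for _ in range(n))  # skewed, rare-ish risks
    y = [random.random() < p for p in p_hat]

    # Split into deciles of predicted risk and compare observed (O) with
    # expected (E) event counts within each group.
    k = 10
    size = n // k
    for g in range(k):
        lo, hi = g * size, (g + 1) * size
        O = sum(y[lo:hi])               # observed events in the group
        E = sum(p_hat[lo:hi])           # expected events under the model
        chi1 = (O - E) ** 2 / E         # 1-df chi-square contribution for the group
        print(f"decile {g}: O={O:4d}  E={E:7.1f}  (O-E)^2/E={chi1:6.2f}")
    ```

    Per-group O and E (plotted or tabulated) show where the model is well calibrated and where it drifts; for a very sparse group, the (O-E)^2/E line can be replaced by the exact binomial probability of observing O events in that group given mean risk E/size.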

    Notice what I don't do. I don't bother with the total (O-E)^2/E chi-square statistic, as I think the summary obliterates important details. I don't look at the p-value at all, because I think of goodness of fit as a quantitative thing, a matter of degree and not uni-dimensional, and not a yes-no that can be "tested." (I think, by the way, that goodness of fit "tests" are an outstanding example of the misuse of null hypothesis significance testing. The null hypothesis that the model is correct is almost always false a priori. The question is not whether the model is right, we know it isn't, but whether it is useful. Hat tip Box.)



    • #3
      Dear Clyde,

      Thanks a lot for your reply - I really appreciate it.

      Kind regards,

      Jane

