Poisson regression - model diagnostics

Emily Tweed

Join Date: Jun 2015

Posts: 26
#1

06 Jul 2015, 04:49

Hello all,

I’m using Stata 12.1 to analyse data on the number of cases of cancer in the population during three-year periods by quintiles of a socioeconomic indicator (SIMD). My underlying data consists of strata containing counts of cases and population estimates by sex, age group, SIMD quintile and time period.

I am using the poisson command in Stata, adjusting for age group and stratifying by year group and sex.

For example,

Code:

xi: poisson numcases i.simd_quin i.age_broad if yeargroup==0 & sex==0, exposure(population) irr
xi: poisson numcases i.simd_quin i.age_broad if yeargroup==0 & sex==1, exposure(population) irr

Very few of the strata contain zero cases - I have run ZIP models with the same covariates, finding no evidence of excess zeros from the Vuong test.

I have also run negative binomial models with the same covariates to check for overdispersion – the LR test results suggest this is not a problem.

However, when I rerun the model using the glm poisson command instead of the standard poisson command, predict the deviance residuals and plot them on a pnorm plot, they appear to be very non-normal. Similarly, when I plot the residuals against the linear predictor, it looks like the assumption of constant variance of residuals is also violated.

Although I have only included two covariates in the model (SIMD quintile and age group), so it might appear under-specified, this is because I have stratified for the others I think a priori are likely to be relevant (sex and time period).

I’m therefore unsure where the problems with non-normality and heteroscedasticity are coming from and how to solve them. My reading suggests that calculating robust standard errors in the Poisson model might help – however, I am unsure whether it is possible to do this and then re-check the normality/constant variance assumptions?

Any suggestions would be very gratefully received – and if anything in the above description is unclear, I’m more than happy to clarify.

Many thanks

Emily
Nick Cox

Join Date: Mar 2014

Posts: 35429
#2

06 Jul 2015, 04:59

I think you are imputing classical linear regression assumptions where they don't belong.

With a Poisson regression model, there's no assumption that residuals are normal, and they can't be; similarly with any homoscedasticity assumption. There are various ways to see this. One is to realise that even ideally the variance of a Poisson must increase with the mean.
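A small simulation can make the mean-variance point concrete. This is only a sketch on made-up data, not the cancer data from #1: the variable names (mu, y, rawres, muband), the seed and the band cut-points are all arbitrary choices for illustration. Because the counts are drawn from a Poisson with known mean mu, the variance of the raw residuals within each band should come out close to the typical mean in that band, so it rises from band to band even though the model is exactly right.

```stata
* Sketch on simulated data; all names and settings here are illustrative assumptions
clear
set seed 12345
set obs 5000

* True Poisson means ranging from 1 to about 55
generate double mu = exp(4 * runiform())
generate y = rpoisson(mu)

* Raw residuals about the true mean
generate double rawres = y - mu

* Group observations into bands of the mean and compare residual variances
egen muband = cut(mu), at(0 5 10 20 40 60)
tabstat rawres, by(muband) statistics(mean variance n)
```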

Using robust standard errors makes no difference to coefficient estimates and therefore no difference to the (raw) residuals.
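One way to check this directly, sketched here with the first model from #1 (the stored-estimate names default_se and robust_se are arbitrary labels introduced for illustration), is to fit the model with and without vce(robust) and put the results side by side. The coefficient rows should match, with only the standard errors differing.

```stata
* Model from #1 with default (ML) standard errors
poisson numcases i.simd_quin i.age_broad if yeargroup==0 & sex==0, exposure(population)
estimates store default_se

* Same model with robust (sandwich) standard errors
poisson numcases i.simd_quin i.age_broad if yeargroup==0 & sex==0, exposure(population) vce(robust)
estimates store robust_se

* Side-by-side table of coefficients and standard errors
estimates table default_se robust_se, b(%9.4f) se(%9.4f)
```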

I'd be most concerned to try to identify whether the functional form was about right.
Emily Tweed

Join Date: Jun 2015

Posts: 26
#3

06 Jul 2015, 05:24

Hi Nick

Thanks very much for getting back to me so promptly: your answer was very helpful and has made several things a lot clearer for me.

Looking back at the teaching materials I was using for reference, they seem to suggest the pnorm plot as a means of assessing goodness of fit of the Poisson model - which I think is where my mistaken belief about the assumptions arose. Even if this isn't an assumption of Poisson regression, is there any benefit in assessing the normality of residuals (and for that matter, their variance)?

And finally, just to clarify for this novice - by functional form, do you mean the choice between Poisson, negative binomial etc?

Many thanks again

Emily
Nick Cox

Join Date: Mar 2014

Posts: 35429
#4

06 Jul 2015, 05:31

Functional form: I mean whether your particular Xb matches the patterns in the data, e.g. whether one or more of your predictors should be transformed, or indeed what the predictors should be. Added variable plots or residual versus predictor plots are more helpful here than any plots focused on the residual distribution.

Teaching materials: I just know they weren't my teaching materials. If normality is a reference, then I'd always use qnorm anyway.
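The plots mentioned in this post could be sketched as follows for the first stratum in #1. This is one possible set of checks, not a prescribed recipe: the new variable names dres and linpred are assumptions made for illustration, and box plots are used only because both predictors in that model are categorical. Added variable plots are left out here because avplot is only available after regress.

```stata
* Refit the model from #1 with glm so that deviance residuals are available
glm numcases i.simd_quin i.age_broad if yeargroup==0 & sex==0, family(poisson) link(log) exposure(population)

* Deviance residuals and linear predictor for the estimation sample only
predict double dres if e(sample), deviance
predict double linpred if e(sample), xb

* Residuals against each predictor: look for systematic shifts between categories
graph box dres, over(simd_quin) yline(0)
graph box dres, over(age_broad) yline(0)

* Residuals against the linear predictor: look for curvature or other structure
scatter dres linpred, yline(0)

* If normality is wanted as a reference, a quantile plot rather than pnorm
qnorm dres
```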
Emily Tweed

Join Date: Jun 2015

Posts: 26
#5

06 Jul 2015, 05:35

Thanks.
