estimates diverging (correlation > 1)

Paul Sacamano

Join Date: May 2015

Posts: 7
#1

13 Aug 2015, 11:33

I am trying to perform a simple bivariate analysis of multilevel cross-sectional data representing egocentric social networks at baseline. Level 1 comprises the characteristics of peers/alters and their ties with the ego; these are clustered within egos (survey respondents), which form level 2. I have binary level 2 outcomes (ED visit or hospitalization in the prior 6 months), and the level 1 exposure variables are primarily categorical (sex, race, type of support, etc.).

I understand there are several options in Stata for modeling multilevel data. With xtgee, I'm consistently getting the following error: estimates diverging (correlation > 1).
xtgee emerw1 gendx1, family(binomial) link(logit) i(wspid) corr(exch) vce(robust) eform

With meqrlogit the model does not converge.
meqrlogit emerw1 gendx1 || wspid: , cov(exch) or

Here's a cross tab of the exposure and outcome:

   Z1 6 months |
 received care |      B9a gender
         in ER |      Male     Female |     Total
---------------+----------------------+----------
            No |     1,672      1,857 |     3,529
               |     61.25      59.61 |     60.38
---------------+----------------------+----------
           Yes |     1,058      1,258 |     2,316
               |     38.75      40.39 |     39.62
---------------+----------------------+----------
         Total |     2,730      3,115 |     5,845
               |    100.00     100.00 |    100.00

          Pearson chi2(1) =   1.6171   Pr = 0.203
Paul Sacamano

Join Date: May 2015

Posts: 7
#2

13 Aug 2015, 13:59

I failed to mention the range of cluster sizes:

Mean: 9.9
Std. Dev: 4.8
Min: 1
Max: 30

If I limit the xtgee model to clusters of at least size 10, the correlation > 1 error no longer occurs; if I limit cluster size to 5 with melogit, the models converge. However, this is not a workable solution, as I'll be losing a lot of information from small clusters.
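
For reference, the size restriction described above could be sketched roughly as follows. This assumes the data are in long form with one row per alter and that wspid identifies the ego; n_alters and ego_tag are hypothetical variable names introduced only for illustration.

```stata
* Count the level 1 observations (alters) within each ego's cluster.
* n_alters is a hypothetical name, not a variable in the original data.
bysort wspid: generate n_alters = _N

* Flag one row per ego so that cluster sizes are summarized once per cluster.
egen byte ego_tag = tag(wspid)
summarize n_alters if ego_tag

* Refit the exchangeable GEE on clusters with at least 10 alters.
xtgee emerw1 gendx1 if n_alters >= 10, family(binomial) link(logit) i(wspid) corr(exch) vce(robust) eform
```

Changing the threshold in the if qualifier shows how many clusters, and how many alters, each restriction would drop.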

Last edited by Paul Sacamano; 13 Aug 2015, 14:10.
Joseph Coveney

Join Date: Apr 2014

Posts: 4368
#3

13 Aug 2015, 19:03

You'll probably end up having to do what Prof. Gary Koch recommended (in the analogous SAS PROC GENMOD code):

Code:

xtgee emerw1 gendx1, family(binomial) link(logit) i(wspid) corr(independent) vce(robust) eform

That is, rely on the robust standard errors (EMPIRICAL, I believe is the analogous SAS option) to accommodate within-cluster correlation, which in conjunction with a valid assumption that you've got the means properly specified will give you consistent estimates.
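
As a rough check on what this specification does, and assuming the same variables as in the commands above, the sketch below fits the independence GEE alongside an ordinary logistic regression with cluster-robust standard errors. With an independent working correlation the point estimates should coincide; the standard errors may differ slightly because the two commands need not apply the same finite-sample adjustment.

```stata
* GEE with an independent working correlation and robust (sandwich) standard errors.
xtgee emerw1 gendx1, family(binomial) link(logit) i(wspid) corr(independent) vce(robust) eform

* Pooled logistic regression with standard errors clustered on the ego.
* The odds ratio should match the xtgee result above.
logit emerw1 gendx1, vce(cluster wspid) or
```

If the two odds ratios agree, the working correlation is not contributing to the point estimate, and the within-cluster dependence is handled entirely through the robust variance.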
Paul Sacamano

Join Date: May 2015

Posts: 7
#4

27 Aug 2015, 09:33

Just getting back to this. Thanks for the response.

My follow-up questions are:

1. If I'm assuming the correlation to be independent then do I even need to use GEE? Would it not then become a logistic regression model where my covariates become a cluster mean of the level 1 observations for that predictor (or proportion for binary predictors)?

2. Unlike in mixed models, is it the case that the GEE working correlation structure does not affect the marginal parameter estimates, but does affect the standard error estimates?

3. I have unequal numbers of members across clusters, ranging from 1 to 30. Would specifying an independent structure lose the ability to weight clusters with more members as providing more information than smaller clusters?

Thanks again.

Last edited by Paul Sacamano; 27 Aug 2015, 10:12.
Paul Sacamano

Join Date: May 2015

Posts: 7
#5

27 Aug 2015, 11:00

Correction to question 1 above: a logistic regression model using cluster mean values for level 1 covariates would not give the same parameter estimates as GEE. However, I'm still concerned about having such a wide range of cluster sizes and wonder whether I need to add weighting.
Marcello Schmidt

Join Date: Sep 2021

Posts: 1
#6

22 Sep 2021, 10:59

Originally posted by Joseph Coveney:

You'll probably end up having to do what Prof. Gary Koch recommended (in the analogous SAS PROC GENMOD code):

Code:

xtgee emerw1 gendx1, family(binomial) link(logit) i(wspid) corr(independent) vce(robust) eform

That is, rely on the robust standard errors (EMPIRICAL, I believe is the analogous SAS option) to accommodate within-cluster correlation, which in conjunction with a valid assumption that you've got the means properly specified will give you consistent estimates.

Joseph Coveney: Could you provide the source for this, please? I am having the same problem, and using corr(independent) with vce(robust) worked. I could not find the exact source by searching the internet.
Joseph Coveney

Join Date: Apr 2014

Posts: 4368
#7

22 Sep 2021, 13:01

It was an oral comment that he made at an annual meeting (of the local chapter of the Drug Information Association, I believe), in Tokyo in the first postmillennial decade or so. After one of the presentations, someone in the audience brought up the question of how to choose the most appropriate structured working correlation matrix when modeling longitudinal data with GEE in SAS. His comment was to the effect of (paraphrasing) "why bother, when you can handle the problem with empirical standard errors?" As for something in print that you could point to, he might have broached the topic in the pertinent chapter (Chapter 15) of the book that he coauthored with Maura E. Stokes and Charles S. Davis, Categorical Data Analysis Using SAS, Third Edition (SAS Institute, 2012).