  • melogit random effects are perfectly correlated

    I am interested in estimating the likelihood that a patient has a prescription for a specific drug. Beneficiaries are nested within physicians, and I have observations for two periods, pre and post an FDA safety alert. I want physician-specific effects for their relative level of prescribing the drug in each of the two periods.

    So, I have: melogit y post || physician: post, cov(un)

    However, when I estimate the physician-level random effects using predict bpst bint, reffects, I get values that are perfectly correlated. I know that physicians who had a higher share of beneficiaries on the relevant drug before the safety alert had a higher share after the alert, but those shares are definitely not perfectly correlated. Where am I going wrong?

    Thanks

  • #2
    What are the statistics that you say are perfectly correlated? What you say about higher share before and higher share after makes me think you are trying to estimate some correlation between pre- and post- values of y. You will not get that from the predicted values of the random effects--those have essentially nothing to do with that. Perhaps you are looking for the intra-class correlation? That would come from -estat icc-
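
    If it is the intra-class correlation you are after, a minimal sketch would look something like the following, using your variable names and a random-intercept-only version of your model (which is where -estat icc- applies):

    Code:
    melogit y post || physician:
    estat icc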



    • #3
      The statistics that are perfectly correlated are the physician-level random effects from the model - the random intercept and the random slope on post. I am not interested in the correlation between these two, except that I know they shouldn't be perfectly correlated, which leads me to suspect I'm either specifying the model wrong, or there is some other problem I haven't been able to diagnose.

      Physicians had different responses to the FDA warning. Some abandoned the drug altogether, others prescribed it for fewer patients, while others didn't change their prescribing at all. I'm hoping to characterize physicians based on this response. So, that's what I mean by share before and after. The data is at the patient level, with an indicator equal to 1 if the patient had a prescription for the drug in the period and 0 if they did not. Not all patients are in both periods, but all physicians are in both periods. Each patient is assigned to a physician.

      Thanks



      • #4
        Well, I think you need to show the complete output of your model, along with any subsequent commands and output leading to the correlation you got.



        • #5
          Here is the output and a scatter plot of the estimated random effects.

          Code:
          meqrlogit TZD post nrx || PCP_assgn: post, cov(unstructured) or
          Refining starting values:

          Iteration 0: log likelihood = -138007.13
          Iteration 1: log likelihood = -136502.26 (not concave)
          Iteration 2: log likelihood = -136115.21

          Performing gradient-based optimization:

          Iteration 0: log likelihood = -136115.21 (not concave)
          Iteration 1: log likelihood = -135991.65
          Iteration 2: log likelihood = -135776.02
          Iteration 3: log likelihood = -135755.36
          Iteration 4: log likelihood = -135750.38
          Iteration 5: log likelihood = -135750.34
          Iteration 6: log likelihood = -135750.34

          Mixed-effects logistic regression               Number of obs     =    329,599
          Group variable: PCP_assgn                       Number of groups  =      13178

                                                          Obs per group:
                                                                        min =         10
                                                                        avg =       25.0
                                                                        max =        215

          Integration points = 7                          Wald chi2(2)      =   36171.59
          Log likelihood = -135750.34                     Prob > chi2       =     0.0000

          -----------------------------------------------------------------------------
                  TZD | Odds Ratio   Std. Err.      z    P>|z|    [95% Conf. Interval]
          ------------+----------------------------------------------------------------
                 post |   .4972945   .0055507   -62.59   0.000    .4865335     .5082935
                  nrx |   4.029187   .0302935   185.35   0.000    3.970249     4.089001
                _cons |   0.322662   .0005085  -217.89   0.000    0.312848     .0332784
          -----------------------------------------------------------------------------

          -----------------------------------------------------------------------------
            Random-effects Parameters  |   Estimate   Std. Err.   [95% Conf. Interval]
          -----------------------------+-----------------------------------------------
          PCP_assgn: Unstructured      |
                             var(post) |   .0042297   .0016898    .0019331    .0092548
                            var(_cons) |   .4968397   .0140903    .4699766   .52523282
                       cov(post,_cons) |   0.458421   .0088103    .0285743    .0631099
          -----------------------------------------------------------------------------
          LR test vs. logistic model: chi2(3) = 8972.15 Prob > chi2 = 0.0000

          Note: LR test is conservative and provided only for reference.

          Code:
          predict bpst bint, reffects reses(se_pst se_int)
          Code:
          corr bpst bint
          (obs=329,599)

                       |     bpst     bint
          -------------+------------------
                  bpst |   1.0000
                  bint |   1.0000   1.0000

          [Scatter plot of the predicted random effects (bpst vs. bint): scatter_reffects.png]


          Thanks



          • #6
            Thank you. I am stumped. I don't see anything wrong with the code, and I agree that the results really do not make sense.

            Have you tried this with -melogit- instead of -meqrlogit- (same model, different estimation algorithm)? Do you get the same thing?
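
            For reference, the -melogit- counterpart of the commands in #5 would be something along these lines (same variable names; this is a sketch, not output I have run on your data):

            Code:
            melogit TZD post nrx || PCP_assgn: post, cov(unstructured) or
            predict bpst bint, reffects reses(se_pst se_int)
            corr bpst bint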



            • #7
              I have. And yes, I get the same thing - it just takes a little bit longer to converge. Thanks for looking.



              • #8
                The code below, from one of Stata's sample datasets, would suggest it's not a bug in meqrlogit per se.

                Code:
                webuse bangladesh
                meqrlogit c_use urban age child* || district: urban, cov(unstr)
                estat sd
                predict r_urban r_cons, reffects
                twoway scatter r_urban r_cons
                corr r_urban r_cons
                Jeannie, if you run -estat sd- after your own command, it will show the estimated standard deviation of each random effect and their estimated correlation. Even in the output from my sample code, the model-estimated correlation of the random effects doesn't match what you get when you predict and then correlate, because the latter uses individuals as the unit of observation, whereas I'm pretty sure the model-estimated correlation uses the physician (the district, in my example) as the unit of observation. That doesn't explain why yours is 1.0, however.
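
                If you want the correlation computed with one observation per physician, a quick sketch using -egen-'s tag() function would be (pcp_tag is just an illustrative variable name):

                Code:
                egen byte pcp_tag = tag(PCP_assgn)
                corr bpst bint if pcp_tag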

                I agree the code looks correct. My guess would have to be something strange about the data, but I can't quite articulate what.

                As a side note, can you double-check what you posted about the variances and covariance of the random effects, and can you report the results of -estat sd-? Mathematically, the correlation is covariance(x,y) / [std. dev.(x) * std. dev.(y)], and the standard deviation of each random effect is the square root of its variance. When I apply this formula to the output from my sample code, I obtain the same correlation of the random effects that -estat sd- reports. From your posted output, I can't get a sensible answer for the correlation (I got 10.000 and something). If your numbers don't match either, it could be something off with your data, or it may be worth checking your Stata version and whether it is up to date, using -which- and -update-. I'm on Stata 15.1, the 11/4/2018 update.
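
                To make that arithmetic concrete, plugging in the variance components exactly as they appear in your posted output gives roughly 10, which cannot be a correlation; that is why I'd like you to double-check those numbers:

                Code:
                display 0.458421 / (sqrt(.0042297) * sqrt(.4968397))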
                Be aware that it can be very hard to answer a question like this without sample data. You can post an extract of your data with the dataex command; type help dataex at the command line.

                When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



                • #9
                  Dear all, thank you very much!! I have learned a lot from your posts.
                  Cheers, Hassen

