Mixed model with Cluster function

Yaliu He

Join Date: Sep 2017

Posts: 9
#1

07 Jan 2022, 09:40

I am trying to fit a three-level mixed-effects model: clients are nested within therapists (clients of the same therapist may receive similar treatment and have similar results), and repeated measures are nested within clients (measurement occasions for the same client are assumed to be correlated).

The Stata command I used is: mixed DV IV || therapist: || client: time, ml

However, the clients are not independent: about 1/3 of them have a partner or family member in the same dataset. Another variable, "case", identifies the family unit. I want to account for this clustering among clients by adding cluster-robust standard errors with the vce(cluster clustvar) option.

The Stata command I used is: mixed DV IV || therapist: || client: time, vce(cluster case)

Then I got this error message: "highest-level groups are not nested within case". While I understand that case is indeed not nested within therapist, I wonder how I can account for the clustering by case for this 1/3 of clients while still using a three-level mixed model. Any suggestions? Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

19 Jan 2022, 13:42

Well, as Stata has told you, the variable specified in -vce(cluster )- must be at a higher level than that of the top-level group in the model. That is a mathematical constraint on which the calculation of cluster-robust standard errors relies.

I think you have two choices here.

1. Introduce case as a new level in the model, between client and therapist. This is the approach I would probably take in this case. The fact that there are clients who have no other family members doesn't matter. There is no reason you can't have some singletons.

2. Alternatively, you can eliminate the problem by selecting only one client from each family. If I were going to do this, I would probably do it by random selection, although, depending on the research questions and the meaning of the variables, sometimes it makes sense to select the one who began therapy first, or the one who began last, or the oldest, or youngest, or something like that. The reason I don't really like this approach as much is that it entails discarding at least 1/3 of the data. Now, given that the discarding is attributable to clustering, you are not losing fully 1/3 of the information, but still... I would use this approach only in two situations: a) in a different sample where the number of discards would be just a handful, or b) in the event that the other approach failed because the introduction of the case: level into the model caused it to fail to converge.
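A rough sketch of what the two options could look like in Stata. DV, IV, time, therapist, case, and client are placeholder names from the original post (with the outcome listed first, as -mixed- expects); the seed and the helper variables u and keep_client are made up for illustration. The sketch assumes that the data are in long form, that every client (singletons included) has a non-missing case identifier, that client identifiers are unique across cases, and that all members of a case see the same therapist, so that case nests within therapist.

```stata
* Option 1: add case as a level between therapist and client.
mixed DV IV || therapist: || case: || client: time, mle

* Option 2: keep one randomly chosen client per case, then refit.
preserve
set seed 20220119                       // arbitrary seed, for reproducibility
* draw one random number per client and copy it to all of that client's rows
bysort client: gen double u = runiform() if _n == 1
bysort client (u): replace u = u[1]
* within each case, flag the client holding the smallest random number
bysort case (u client): gen byte keep_client = (client == client[1])
keep if keep_client
mixed DV IV || therapist: || client: time, mle
restore
```

Singletons form cases of size one, so under option 2 they are always retained.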

Finally, I will add that if you go with the first suggestion and then the results indicate that there is very little variance at the -case:- level, you could just go back to your original model and ignore the whole issue (other than reporting that you dealt with it in this way and the results indicated that it was not necessary.)
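One possible way to check whether the -case:- level matters, sketched with the placeholder names from the original post: fit the model with and without that level by maximum likelihood and compare the two fits. The likelihood-ratio comparison is an illustration rather than something prescribed above, and the stored-estimate names with_case and no_case are arbitrary. Because a variance of zero lies on the boundary of the parameter space, the reported p-value should be read as conservative.

```stata
* Model with the case level
mixed DV IV || therapist: || case: || client: time, mle
estimates store with_case

* Original model without the case level
mixed DV IV || therapist: || client: time, mle
estimates store no_case

* Likelihood-ratio comparison of the two nested models
lrtest with_case no_case
```

The estimated case-level variance itself appears in the random-effects parameters table of the first -mixed- output.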
Yaliu He

Join Date: Sep 2017

Posts: 9
#3

19 Jan 2022, 14:04

Hi Clyde, thanks for your suggestions. They make so much sense. I did not know that I could still add another level even though not every participant has that level of nesting. I initially used the second method, randomly selecting one member from each couple/family case. As you said, it reduced the sample size, and reviewers were not happy because we might lose valuable information by not including all members of couple and family cases. Now I will definitely try the first method as you suggested. Thanks a lot.
Yaliu He

Join Date: Sep 2017

Posts: 9
#4

11 Mar 2022, 14:59

Hi Clyde, I have a follow-up question regarding my previous post. I followed your first suggestion and am now trying to determine whether there is little variance at the -case:- level. Should I look at the random-effects parameters in the results, or do I need to run another command?
I copied the results here. "Newcasenumber" represents the "case" level. For example, the estimate for the case level is .19. Does that indicate a large or a small variance?

Thank you!
Attached file: results (1).JPG
Yaliu He

Join Date: Sep 2017

Posts: 9
#5

11 Mar 2022, 15:04

Sorry, here are the results.
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#6

11 Mar 2022, 16:51

The variance at the newcasenumber level is 0.19 (to 2 decimal places) and the residual variance is 0.46. So the former is approximately 40% as large as the latter. That is much too large a fraction to consider the newcasenumber level ignorable. If anything is ignorable here, it is the therapist level, which is less than 1% of the total of the therapist, newcasenumber, and residual variances. For a more precise estimate that also takes the random slopes into consideration and gives confidence bounds, use the -estat icc- command after the regression.
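A minimal sketch of that follow-up, assuming the four-level model was fit with the placeholder names DV, IV, and time from the original post and the newcasenumber variable mentioned above. As far as I know, when the model contains a random slope, -estat icc- reports intraclass correlations conditional on the random-slope covariate (here time) being zero. The final line simply reproduces the ratio of the two variance estimates quoted above, which comes to roughly 0.41.

```stata
* Refit the model with the case level included
mixed DV IV || therapist: || newcasenumber: || client: time, mle

* Intraclass correlations, with confidence intervals, at each nested level
estat icc

* Case-level variance relative to residual variance, from the quoted estimates
display 0.19/0.46
```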