Two groups within clusters

Guest
#1

13 Apr 2015, 09:46

I would like to develop an analysis that looks into cluster-level effects of predictors on two different groups, preferably estimated in a unified analysis.

Consider girls and boys (groups) in many schools (clusters). Scores on the dependent variable vary substantially across schools, but differently for girls and boys. The effects of predictors also differ between the two genders. Rather than running straightforward group-based analyses (one multilevel model for boys, one for girls), I would like an analysis that integrates data from both groups. In principle, the cluster level could have two dependent variables within a unified analysis, one for boys and one for girls. That would allow estimating how these two variables are related to each other within clusters and how a predictor affects them differently, potentially in opposite directions.

I haven't found any example of an analysis using this approach. I am aware of a "web note" by the Mplus team (web note 16) that takes up this challenge within the framework of multilevel modelling, but it does not give clear answers.

I am not sure a variant of multilevel modelling is the only way to go, even though this was my starting point. I could focus on the cluster-level only, thus not necessarily using multilevel modelling.

Any thoughts would be highly appreciated.
Clyde Schechter

Join Date: Apr 2014

Posts: 29959
#2

13 Apr 2015, 11:05

It sounds as if you can do this as a multi-level model that includes interaction terms between sex and the other predictor variables.
Guest
#3

13 Apr 2015, 11:12

Yes, that's one option. Agree. But such a cross-level interaction is usually difficult with more than one moderation (or maybe two) at a time -- in my experience, at least. If there are other suggestions (even outside multilevel), I'm eager to hear them. But it may be that Clyde's suggestion is the way to go, rather than trying to develop something more complex.
Guest
#4

13 Apr 2015, 11:18

... hm, thinking of it... Clyde, it would be gender (level 1) moderating the effects of a predictor at level 2. Cross-level interaction is usually the other way 'round. In single-level moderation, we do not distinguish between A moderating B and B moderating A (mathematically, even though we may do so theoretically). In cross-level interaction it is more evident which variable moderates the other: level 2 moderating a link at level 1. But I want it the other way, a variable at level 1 moderating the effect of a variable at level 2. I'm not sure how to do that.
Clyde Schechter

Join Date: Apr 2014

Posts: 29959
#5

13 Apr 2015, 11:46

Maybe you're "overthinking" it. Whether within a single level (where you have already noted this in #4) or across levels, "moderation" is a symmetric relationship. What we consider the moderator and what the moderated effect is a matter of perception and interpretation, not mathematics.
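As a sketch of that symmetry (the symbols are illustrative and do not come from the thread): let y_ij be the outcome for student i in school j, g_ij the 0/1 sex indicator, z_j a school-level predictor, u_j a school random intercept, and e_ij the residual. A random-intercept model with a cross-level interaction could then be written as:

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 g_{ij} + \beta_2 z_j + \beta_3\, g_{ij} z_j + u_j + e_{ij} \\
\frac{\partial\, E[y_{ij} \mid g_{ij}, z_j]}{\partial z_j} &= \beta_2 + \beta_3\, g_{ij} \\
E[y_{ij} \mid g_{ij}=1, z_j] - E[y_{ij} \mid g_{ij}=0, z_j] &= \beta_1 + \beta_3\, z_j
\end{align}
```

The same coefficient \beta_3 says both how the effect of z differs by sex and how the sex gap changes with z, so the model does not single out one variable as "the" moderator.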
Guest
#6

13 Apr 2015, 12:20

That approach seems to give one DV mean at the cluster level (one random intercept), but the cluster mean varies substantially between the two groups.
Clyde Schechter

Join Date: Apr 2014

Posts: 29959
#7

13 Apr 2015, 13:10

Are you including the group (sex) in the random effects at the cluster level as well as among the fixed effects?
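What that question points at can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the names school, z, boy, female, and y are hypothetical, and it uses mixed (Stata 13 or later) rather than xtmixed. Each school gets two correlated random intercepts, one for boys and one for girls, and the school-level predictor z gets a separate fixed slope for each sex.

```stata
* Simulated data; every variable name here is hypothetical.
clear
set seed 2015
set obs 60
generate school = _n
* School-level predictor
generate z = rnormal()
* Correlated school effects for boys and girls (correlation 0.5)
generate u_boy  = rnormal()
generate u_girl = 0.5*u_boy + sqrt(0.75)*rnormal()
* 40 students per school
expand 40
generate female = runiform() < 0.5
generate boy    = 1 - female
generate y = cond(female, 1 + 0.5*z + u_girl, -0.3*z + u_boy) + rnormal()

* Fixed part: separate intercepts and separate z slopes by sex.
* Random part: one school intercept for boys, one for girls, freely correlated.
mixed y boy female c.z#i.female, noconstant || school: boy female, noconstant covariance(unstructured)

* Does the effect of z differ between girls and boys?
lincom 1.female#c.z - 0.female#c.z
```

The unstructured covariance reports a variance for each sex-specific school intercept plus their covariance, which is one way to see how the boys' and girls' school means are related within clusters.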
Guest
#8

13 Apr 2015, 13:13

No... Only at level 1. And that's part of the problem.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#9

13 Apr 2015, 16:04

I don't think it makes sense to have gender as a "group." For one, the statistical properties of the estimators would be unknown (and likely not good). The approximations only work well when you have a large number of groups. Two is not going to cut it. (And the number of schools is essentially irrelevant if you allow correlation within gender -- if I understand what you're getting at.)

HLMs are not intended to allow for individual characteristics to decide levels. A school is a natural grouping; gender is not. This should be tied to cluster sampling, where clusters of schools are sampled. We do sample two gender clusters. If you group by gender why not race or education? Neither of these would be appropriate, either.

My suggestion is to include gender as a covariate and possibly interact with other covariates of interest. There is only one group, and that is school.

Here's one way to think about it. Suppose someone gave you a random sample, so that students appearing in the same school was incidental. In fact, suppose school is not known, but test score and gender are known. Would you ever think of using an HLM with the individual as level one and gender as level two? That would not be appropriate, and probably wouldn't work mechanically, anyway.
Clyde Schechter

Join Date: Apr 2014

Posts: 29959
#10

13 Apr 2015, 16:32

Reading Jeff Wooldridge's post made me aware that the original poster's description is ambiguous. It is an unfortunate fact of life with HLMs that different people use a different ordering when describing their models. So it wasn't actually clear what the original poster had in mind when referring to level 1 and level 2. I had assumed that the clusters referred to schools and that sex would be used as an ordinary covariate. That is, I was assuming what Jeff is recommending with regard to what is nested in what here.
Guest
#11

13 Apr 2015, 23:59

My point is the following:

1. We have clusters (I used schools as an illustration)
2. We have groups within clusters (I used genders as an illustration).

Level 1 = within (students)
Level 2 = between (schools)

Obviously, we wouldn’t use gender as a clustering variable. It is a level 1 variable. But this is a problem, because effects of level 2 variables are different for the groups at level 1 (here illustrated with gender). As originally stated, this might be dealt with by using multi-group analysis (one multilevel for boys, one multilevel for girls). But this isn’t really satisfying for me and I wondered whether I could use an approach that handles groups within clusters better (while groups here are not to be considered a second level of clustering).

So I cannot agree with the statement “There is only one group, and that is school.” If school-level predictors have different effects for boys and girls, then we have groups at the “within-level”, level 1.

I'd like an analysis that shows how effects of level 2 (schools) are moderated by a dichotomous variable at level 1, while acknowledging that the random intercept for the DV varies across the two groups.

It seems to me that the only way to handle this is by using multi-group analysis, one multilevel for boys and one multilevel for girls. This is a problem, because effects of a level 2 predictor on boys can be explained by what effects it has on girls.
Oded Mcdossi

Join Date: Jun 2014

Posts: 577
#12

14 Apr 2015, 03:21

Dear Guest, you should be more specific in describing your variables. What are your level 2 (school-level) variables, and what are your level 1 (individual-level) variables? Based on post #4:

"In cross-level interaction it is more evident which variable moderates the other, level 2 moderating a link at level 1. But I want it the other way, a variable at level 1 moderating the effect of a variable at level 2."

Take, for example, the following cross-level interaction between gender (individual level) and school sector. This model estimates whether the gender gap in math scores varies between public and private schools.

Code:

use http://www.ats.ucla.edu/stat/stata/examples/mlm_imm/imm10, clear
generate female = (sex == 2)
xtmixed math i.female##i.public || schnum: female, variance covar(un) mle

In a mixed model you can also estimate whether the advantage (or disadvantage) of females in math scores is due to the over- or under-representation of females in schools (in other words, whether it is an individual or a climate effect). The following model tries to answer this question:

Code:

bys schnum: egen mean_female = mean(female)
xtmixed math i.female c.mean_female || schnum: , variance covar(un) mle

You may also ask (generally, if both variables have significant effects) how the gender gap varies between schools where females are the majority and schools where they are the minority. The following model tries to answer this question:

Code:

xtmixed math i.female##c.mean_female || schnum: female, variance covar(un) mle

I hope this helps, if not please provide us more details and maybe someone would answer.

Last edited by sladmin; 11 Dec 2017, 09:35. Reason: anonymize poster
Guest
#13

14 Apr 2015, 03:46

Thanks, Oded Mcdossi. I will study your suggestion in detail. I understand your request for being more specific concerning the variables (beyond schools being the clusters and gender giving groups within clusters).

I used that as an illustration of a general problem in multilevel modelling. I actually have a different type of cluster and a different type of grouping at the individual level. I think it is fair to say that I am looking at something that appears to be a general limitation in standard multilevel modelling and that is not dependent on which variable we want to explain. Also, it appears I'm not the first to have thought of this. See https://www.statmodel.com/examples/w.../webnote16.pdf

I have asked Joop Hox for advice. I'll get back to this list if he responds.

Guest

Last edited by sladmin; 11 Dec 2017, 09:47. Reason: anonymize poster
Oded Mcdossi

Join Date: Jun 2014

Posts: 577
#14

14 Apr 2015, 04:34

The request for more details may, in my view, help both you and us move toward the right model to use. In mixed models you can use natural level 2 variables, that is, variables that are characteristics of the level 2 units. School sector and school size are natural level 2 (school) variables. However, you can also aggregate level 1 (individual) variables (mean_ses, percent_female, etc.) and add them to your model as level 2 variables. I think this distinction matters when you ask what moderates what in a cross-level interaction.
I'd be happy if you share with us what you've found.
Guest
#15

14 Apr 2015, 05:00

From Joop Hox I received the following:
"I agree with the discussion that gender is not a group as in grouped hierarchical data. With 2 groups this would not work, and gender is just a variable at the individual level. I think I would explore the possibility of having a multilevel structure with a multi-group (2-group) model at the lowest level. I think Mplus should be able to handle this, probably via the mixed model + known class method."

Comment:
This suggestion by Joop is consistent with my original idea. But I am still uncertain it will suffice for what I want to achieve. I see some interesting suggestions in the Web Note I referred to above (by Asparouhov and Muthén), including the following:

"In model H3 we used the define statement to split the clusters in two clusters by adding gender*3000000. The value 3000000 is bigger than the maximum cluster value in the original data file and with that statement we are guaranteed that when gender=0 all cluster values are below 3000000 and when gender=1 all values are above 3000000, i.e., each clusters is split by gender to form two new clusters."

I didn't think of that solution. But it seems to follow up on my concern that I needed to account for the grouping at the individual level, because effects of cluster variables may differ between the two individual-level groups and still be interrelated. (I now leave aside that such an analysis might assume the two halves of the new clusters to be independent of each other, while they are in fact not. I haven't looked into the details yet, so I don't know how the authors handle this.)

Whether this would really be a good alternative to using a group-based multilevel analysis (or "known class" within clusters) is a different question. I will play with different models. I think the easiest solution is to settle for multilevel and "known class", and this is also what Joop recommended. Those interested may consult the web note I referred to; most of it will apply to Stata as well.
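For readers working in Stata, the cluster-splitting device quoted from the web note above might be sketched like this. It is a rough translation of the idea, not the web note's own code; the toy data and the variable names cluster, gender, cluster_split, and cluster_pair are hypothetical.

```stata
* Toy data; the variable names cluster and gender are hypothetical.
clear
input long cluster byte gender
101 0
101 1
102 0
102 1
end

* Offset the cluster id by a constant larger than any original id,
* mirroring the define statement quoted from the web note.
generate long cluster_split = cluster + gender*3000000

* Check the property described in the quote.
assert cluster_split < 3000000 if gender == 0
assert cluster_split > 3000000 if gender == 1

* An alternative that needs no constant: one id per cluster-gender pair.
egen long cluster_pair = group(cluster gender)
list
```

Using cluster_split (or cluster_pair) as the grouping variable in a mixed model treats the two halves of each original cluster as separate clusters, which is exactly the independence concern raised above.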