Mixed effects (multilevel) model vs. cluster command

Francis Vergunst replied

28 Nov 2018, 10:10
Thank you for these clear and detailed responses.

One reason for my question is that I want to apply the above model to a categorical outcome with 3 levels (i.e. multinomial logistic regression/mlogit), but from what I’ve read, Stata doesn’t have a dedicated command for this and it can only be done using the gsem command. To complicate things further, I need to do this within a multiple imputation framework, which I’ve read does not work for sem in Stata.

The model I want to run is this: mi estimate: ?command 3_level_outcome pred1 pred2 etc || school: || class:

What options do I have? Are there alternatives to gsem for this kind of problem?
Weiwen Ng replied

27 Nov 2018, 12:01
In addition to what Clyde said, I have some minor points.

1) The sandwich estimator of the variance, requested with the -vce(cluster clustervar)- option, is robust to violations of independence caused by clustering. (The -vce(robust)- option is robust to heteroskedasticity; it is similar but not the same.) I believe the cluster-robust estimator relies on having a large number of clusters to achieve its goals (I hope someone will correct me if this is wrong). If you have few clusters, it won't work as well, and it might be better to model the clustering explicitly.

2) The original post alludes to two levels of clustering - classes, which are nested in schools. That is a situation where you'd default to -mixed-.

3) With -mixed-, you can explicitly model the proportion of variance that's attributable to within-cluster variation, and between-cluster variation. Often, this is of substantive interest.

4) Another option to be aware of is -xtreg, fe-, which uses fixed effects for the clusters. However, it only handles one level of clustering. Economists tend to prefer fixed effects, arguing that they provide unbiased estimates of the coefficients. Other disciplines are not as concerned about this. I mention it for completeness. Seeing as you have two levels of clustering, this won't be a perfect fit for your purposes.

5) The correct term is the -vce(cluster ...)- option. There is a separate set of cluster analysis commands, which do something very different.
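
As a rough sketch of point 3, the variance attributable to each level can be inspected after fitting the model with -mixed- and running -estat icc-. The example below uses Stata's example dataset -productivity- as a stand-in (an assumption, not the original poster's data), with states nested in regions playing the role of classes nested in schools.

```stata
* Stand-in data (assumption): states nested in regions,
* analogous to classes nested in schools
webuse productivity, clear

* Three-level model: random intercepts for region and for state within region
mixed gsp private emp hwy water other unemp || region: || state:

* Intraclass correlations at the region level and the state-within-region level
estat icc
```

With the original data, the region and state identifiers would be replaced by the school and class identifiers, and the covariates by the model's own predictors.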
Clyde Schechter replied

27 Nov 2018, 11:37
What you are calling "the cluster command" is not that. It is simply the use of cluster robust standard errors with -regress-. The distinction is important because Stata does, in fact, have a -cluster- command and what it does is unrelated to the problem you are working with.

I would strongly prefer the use of the -mixed- model here. Yes, it is, in a sense, a regular regression with adjustments made to the standard errors, but the adjustments are better than those provided by -vce(cluster ...)- when you really have hierarchical data. The -regress- approach, even with -vce(cluster ...)-, does not adjust for potential confounding due to systematic differences among classes or schools. The -mixed- model does so.

The only circumstance where I would take -regress- over -mixed- is if the intraclass correlations at the school and class levels are very close to zero. In that case, -mixed- is telling you that there isn't really any systematic effect of class or school on the outcome (at least conditional on behav* etc.), so -regress- would be fine and the results would be essentially indistinguishable.
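
A minimal sketch of how the two approaches could be compared side by side, again assuming Stata's example dataset -productivity- as a stand-in (region and state here are placeholders for school and class, not the original poster's variables):

```stata
* Stand-in data (assumption): states nested in regions
webuse productivity, clear

* Ordinary regression with cluster-robust standard errors at the top level
regress gsp private emp hwy water other unemp, vce(cluster region)
estimates store ols_cl

* Mixed model with random intercepts at both levels
mixed gsp private emp hwy water other unemp || region: || state:
estimates store ml

* Coefficients and standard errors from both fits in one table
estimates table ols_cl ml, se
```

If the estimated variance components at both levels are close to zero, the two columns should look very similar; otherwise the coefficients and standard errors can differ noticeably.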

As for standardized betas, even assuming that this is one of those unusual situations where using them with -regress- would actually make sense (which I question), they make no sense at all with hierarchical data. It isn't even clear what standardization means in the context of hierarchical data. What standard deviation should be used: that of the overall estimation sample? That within class, calculated separately for each class? The pooled within-class one? That within school, calculated separately for each school? The pooled within-school one? How would you explain or justify whichever choice you made? How would anybody go about using or interpreting the results obtained with any of these choices?