  • Analysis of the significance of the error terms of the multilevel model

    I'm using the multilevel step-up strategy in a research study. I estimated the null model, which indicated that the HLM model is preferable to the OLS model. I then estimated a random-intercept model, stored the estimates, and estimated a model with both random intercepts and random slopes. The LR test comparing the random-intercept-only model against the model with random intercepts and slopes indicates that the model with intercepts and slopes is better. However, when I calculate the significance of the error terms, VAR1 is significant only at 10% and VAR4 is not significant. How do I deal with this trade-off? Does the non-significant variance invalidate the model? (The command sequence I describe is sketched below the attached output.)
    [Attachment: Model.jpg — screenshot of the mixed-model estimation output]
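    For reference, the step-up sequence described above corresponds roughly to the following commands (a sketch only; the outcome name y is hypothetical, while var1-var4 and gp_n2 are the names visible in the attached output):

    mixed y || gp_n2:                                            // null model: random intercept only
    estimates store m_null

    mixed y var1 var2 var3 var4 || gp_n2:                        // random intercepts, fixed slopes
    estimates store m_ri

    mixed y var1 var2 var3 var4 || gp_n2: var1 var2 var3 var4    // random intercepts and slopes
    estimates store m_rs

    lrtest m_rs m_ri                                             // slopes-and-intercepts vs intercepts only

    mixed fits by maximum likelihood by default, so the stored fits are directly comparable with lrtest.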

  • #2
    Your "calculat[ion of] the significance of the error terms" is wrong in three key respects.
    1. It is a general fact in statistics that a "statistically significant" result from a joint test, such as the likelihood ratio test comparing these models, does not imply that any of the individual tests it encompasses will also be "statistically significant." The joint test is by no means equivalent to the disjunction of the individual tests. This applies not just here but more generally: if you do a test of the significance of a multi-category discrete variable and it turns out statistically significant, it does not follow that any of the individual categories, when tested, will have a statistically significant result.
    2. Your calculation of the significance of the individual tests, dividing each estimate by its standard error, is incorrect in any case. These are variance components, and the sampling distribution of estimate/standard error is not normal, not even to a gross approximation. This is because, as variance components, they cannot be negative, and, in fact, zero, although theoretically a possible value, is not actually an obtainable value with the algorithms used to calculate them. (A zero variance component will lead to non-convergence of the estimation.) So zero is an edge case here. The appropriate test statistic for a variance component, the likelihood-ratio statistic, turns out to follow a mixture of two chi-square distributions rather than the usual reference distribution (see the note after this list). This is the reason why, in the random-effects portion of the output, Stata does not report z-statistics or corresponding p-values: they would be misleading.
    3. The null hypothesis of zero variance for a random slope, in addition to being an edge case, is in almost all real-world circumstances so far-fetched that testing it is pointless.
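    A brief note on the mixture mentioned in point 2, stated as general background rather than anything taken from the posted output: for a test of a single variance component against zero, the asymptotic null distribution of the likelihood-ratio statistic is the 50:50 mixture

    LR ~ 0.5*chi2(0) + 0.5*chi2(1),  where chi2(0) is a point mass at zero.

    This is what the chibar2(01) label at the foot of the mixed output refers to when the model has a single random intercept; with several random effects the appropriate mixture is more complicated, which is why Stata simply flags the test as conservative.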
    While I don't in general endorse using statistical tests to select models, if you are going to go down that path, at least use the correct approach. The correct approach here is the likelihood ratio test, as you used initially. If you want to go farther in this direction and assess the inclusion of the individual random slopes for var1, etc., then do separate likelihood ratio tests contrasting the random-slopes model with all of them included against each of the models with one of them excluded (a sketch follows below).
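    A minimal sketch of that sequence of tests, again assuming a hypothetical outcome y and the predictors shown in the posted output:

    mixed y var1 var2 var3 var4 || gp_n2: var1 var2 var3 var4
    estimates store all_slopes

    mixed y var1 var2 var3 var4 || gp_n2: var2 var3 var4    // drop the random slope for var1
    estimates store drop_var1

    lrtest all_slopes drop_var1

    // repeat, dropping one slope at a time, for var2, var3, and var4

    Because the excluded variance sits on the boundary of its parameter space, lrtest will typically flag these comparisons as conservative; the actual p-value is somewhat smaller than the one reported.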

    • #3
      Jessica,

      You cannot treat the standard errors of the random-effect estimates in the same way you treat standard errors of fixed effects. As noted in Rabe-Hesketh & Skrondal's Multilevel and Longitudinal Modeling Using Stata, the random-effect standard errors cannot be used to construct test statistics or confidence intervals because the sampling distributions of the estimators often deviate from normality. This is especially true when you have very small variance estimates or few clusters. Unfortunately, you have both of these issues.

      Likelihood ratio testing is the best option, although note that there are other ways to get standard errors and CIs that are more likely to behave in the way you would expect. This is R-specific, but Ben Bolker goes through the various approaches here.

      Just looking at the variance estimates in your output, it does not at first blush appear that var1 and var2 have much, if any, variability across clusters (gp_n2). However, it also appears that var1 and var2 have different magnitudes than var3 and var4. I would suggest rescaling var1 and var2 so that they are more similar in magnitude to var3 and var4. That would mean dividing those variables by some relevant number (a small sketch follows below).
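      For example, a rough sketch (the divisor of 1,000 is arbitrary and purely illustrative; choose whatever puts var1 and var2 on a scale comparable to var3 and var4, and y stands in for your outcome):

      generate var1_r = var1/1000
      generate var2_r = var2/1000
      mixed y var1_r var2_r var3 var4 || gp_n2: var1_r var2_r var3 var4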

      You could also use a likelihood ratio test to compare the model you show with a model that excludes the random slopes for var1 and var2. Note that it is a strong assumption that the covariances between the random slopes and the random intercept are 0, which is what is specified in your model (gp_n2: Independent). You can relax this assumption by adding the option covariance(unstructured). Again, likelihood ratio tests can help you determine whether this provides a better fit to your data than the independent model.
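      A rough sketch of both comparisons (hypothetical outcome y again; the other names are taken from your output):

      mixed y var1 var2 var3 var4 || gp_n2: var1 var2 var3 var4
      estimates store indep

      mixed y var1 var2 var3 var4 || gp_n2: var3 var4          // no random slopes for var1 and var2
      estimates store fewer

      lrtest indep fewer

      mixed y var1 var2 var3 var4 || gp_n2: var1 var2 var3 var4, covariance(unstructured)
      estimates store unstr

      lrtest unstr indep                                        // unstructured vs independent covariance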

      Edit: Crossed with Clyde in #2!
