Country dummies for multilevel analysis

Celia Shi

Join Date: Mar 2016

Posts: 5
#1

05 Jan 2025, 08:17

Hello, everyone! I am running a multilevel analysis with individuals (level 1) nested in countries (level 2). I do not have any level 2 predictors other than the countries themselves. Should I include country dummies in the model, or should I leave out level 2 covariates altogether? Thanks!

BTW, after including country dummies, the country-level variance is nearly zero. Does that mean I should not include the country dummies as fixed effects?
Bruce Weaver

Join Date: May 2014

Posts: 1106
#2

05 Jan 2025, 11:14

It sounds as if you want to estimate a random intercept (or maybe random coefficients) model, with country as the cluster variable. For a random intercept model, your code would look something like this:

Code:

mixed DV {fixed predictors} || country:, options

And country would not be included among the fixed predictors.
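
As a minimal sketch of that setup, here is a simulated example. The variable names (y, x1, country) and all simulation parameters are made up for illustration and are not from the original question. Country dummies would absorb all between-country differences in the intercept, leaving essentially nothing for the random intercept to capture, which is consistent with the near-zero country-level variance reported in #1.

```stata
* Simulated illustration; all names and parameter values are hypothetical.
clear
set seed 12345

* 20 countries, each with its own random intercept u
set obs 20
generate country = _n
generate u = rnormal(0, 0.5)

* 100 individuals per country
expand 100
generate x1 = rnormal()
generate y = 1 + 0.3*x1 + u + rnormal()

* Random-intercept model: country appears only after ||, not as i.country
mixed y x1 || country:

* Share of the residual variance that lies between countries
estat icc
```

With real data, you would drop the simulation lines and substitute your own outcome and predictors.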

Q. How many countries are there in your dataset?

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
Celia Shi

Join Date: Mar 2016

Posts: 5
#3

05 Jan 2025, 22:48

Hi Bruce, thanks for your response! It is helpful. I have 20 countries. I know that is not a large number, but does it still qualify for a multilevel analysis?
Bruce Weaver

Join Date: May 2014

Posts: 1106
#4

06 Jan 2025, 08:09

Hello Celia Shi. See this simulation study by Maas & Hox (2005). Here is an excerpt from the Summary & Discussion section.
"Summing up, both the regression coefficients and the variance components are all estimated without bias, in all of the simulated conditions. The standard errors of the regression coefficients are also estimated accurately, in all of the simulated conditions. The standard errors of the second-level variances are estimated too small when the number of groups is substantially lower than 100. With 30 groups, the standard errors are estimated about 15% too small, resulting in a non-coverage rate of almost 8.9%, instead of 5%. With 50 groups, the non-coverage drops to about 7.3%. This is clearly different from the nominal 5%, but in practice probably acceptable."
I hope this helps.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
Celia Shi

Join Date: Mar 2016

Posts: 5
#5

09 Jan 2025, 01:11

Thank you for bringing up this simulation study! The conclusion is very helpful. I have a follow-up question: Is it appropriate to control for country dummies if I hypothesize that there will be differences between countries?
Erik Ruzek

Join Date: Oct 2017

Posts: 396
#6

09 Jan 2025, 17:07

If you want to run a mixed model with 20 groups, then you want to use restricted maximum likelihood estimation with the Kenward-Roger degrees of freedom correction, which provides more appropriate standard errors than full maximum likelihood. Building off Bruce's syntax:

Code:

mixed DV {fixed predictors} || country:, reml dfmethod(kroger)

As for using country dummy variables (reg DV predictor1 predictor2 i.country) instead of a mixed model, that decision depends more on your field and on the effect you are interested in. In a mixed model, the coefficient for a predictor that varies both within and between countries is a blend of the within-country and between-country associations. With country dummies (i.e., fixed effects), the coefficients reflect within-country associations only. Variables that vary only between countries cannot be estimated in the fixed effects model, whereas they can be in the mixed model. There are also some caveats about endogeneity with mixed models that you should be aware of; there are plenty of discussions on Statalist about this.
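
To make the contrast concrete, here is a rough simulated sketch, not a recommendation for any particular dataset. The variable names (y, x1, xbar, xw, country) and the assumed slopes (0.3 within countries, 1.0 between countries) are hypothetical choices made only so that the two associations differ. It fits the mixed model, the country-dummy regression, and the equivalent xtreg, fe specification, then displays the coefficient on x1 from each.

```stata
* Simulated illustration; all names and parameter values are hypothetical.
clear
set seed 2025

* 20 countries with a random intercept u and a country-level component of x1
set obs 20
generate country = _n
generate u = rnormal(0, 0.5)
generate xbar = rnormal()

* 50 individuals per country; x1 varies both within and between countries
expand 50
generate xw = rnormal()
generate x1 = xbar + xw

* Assumed for illustration: within-country slope 0.3, between-country slope 1.0
generate y = 1 + 0.3*xw + 1.0*xbar + u + rnormal()

* Mixed model (REML with Kenward-Roger df): coefficient blends both associations
mixed y x1 || country:, reml dfmethod(kroger)
scalar b_mixed = _b[x1]

* Country dummies: coefficient reflects within-country variation only
regress y x1 i.country
scalar b_dummies = _b[x1]

* Same within-country estimate via the fixed-effects panel estimator
xtset country
xtreg y x1, fe
scalar b_xtfe = _b[x1]

display "mixed (REML):    " scalar(b_mixed)
display "country dummies: " scalar(b_dummies)
display "xtreg, fe:       " scalar(b_xtfe)
```

The country-dummy and xtreg, fe coefficients should agree up to numerical precision. How far the mixed-model coefficient departs from them depends on how different the within and between associations are, and that gap is where the endogeneity caveat comes in.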