Specifying a complex hierarchical linear model

Mattia Gatti

Join Date: May 2023
Posts: 39


19 Nov 2024, 06:53

Dear all,

I am working with the following cross-sectional data, stacked by domain and id:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str26 countryname float id long domain byte(sex age education) float(voted_party_repgapcat voted_party_blur voted_party_voteprev_el voted_party_mainstream voted_closest_dyad closest_party_voteprev_el closest_party_blur n_parties polarization)
"Austria"  1 1 1 2 3 2  .7063358 29.26 1 4     .   .322638 5.157324  3.087811
"Austria"  1 4 1 2 3 2        .5 29.26 1 1 25.98  .8750908 5.157324  4.897043
"Austria"  1 3 1 2 3 2 .06666667 29.26 1 4     .  .3629214 5.157324  4.455237
"Austria"  1 5 1 2 3 3  .7498868 29.26 1 1 25.98         1 5.157324 4.4996285
"Austria"  1 2 1 2 3 2         1 29.26 1 4 10.43  .9784539 5.157324   5.33847
"Austria"  2 1 2 3 2 2  .7063358 29.26 1 4     .   .322638 5.157324  3.087811
"Austria"  2 2 2 3 2 2         1 29.26 1 4 10.43  .9784539 5.157324   5.33847
"Austria"  2 5 2 3 2 1  .7498868 29.26 1 4 10.43  .9685511 5.157324 4.4996285
"Austria"  2 3 2 3 2 2 .06666667 29.26 1 4     .  .3629214 5.157324  4.455237
"Austria"  2 4 2 3 2 2        .5 29.26 1 1 17.54         1 5.157324  4.897043
"Austria"  3 4 2 2 3 1         1     . 0 2 25.98  .8750908 5.157324  4.897043
"Austria"  3 5 2 2 3 3  .3684777     . 0 2 29.26  .7498868 5.157324 4.4996285
"Austria"  3 2 2 2 3 3   .569599     . 0 2 29.26         1 5.157324   5.33847
"Austria"  3 3 2 2 3 2  .8798283     . 0 3 10.43  .8763679 5.157324  4.455237
"Austria"  3 1 2 2 3 2   .322638     . 0 3     . .02127939 5.157324  3.087811
"Austria"  4 5 2 3 3 3  .3684777     . 0 2 29.26  .7498868 5.157324 4.4996285
"Austria"  4 4 2 3 3 3         1     . 0 3     .   .918285 5.157324  4.897043
"Austria"  4 2 2 3 3 3   .569599     . 0 2 29.26         1 5.157324   5.33847
"Austria"  4 3 2 3 3 2  .8798283     . 0 2 25.98 .29010496 5.157324  4.455237
"Austria"  4 1 2 3 3 2   .322638     . 0 3     . .02127939 5.157324  3.087811
"Austria"  5 2 2 3 1 3  .5237939 17.54 1 1 29.26         1 5.157324   5.33847
"Austria"  5 1 2 3 1 3  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria"  5 4 2 3 1 1         1 17.54 1 1 17.54         1 5.157324  4.897043
"Austria"  5 5 2 3 1 1         1 17.54 1 1 25.98         1 5.157324 4.4996285
"Austria"  5 3 2 3 1 3         1 17.54 1 1 25.98 .29010496 5.157324  4.455237
"Austria"  6 3 2 5 2 1 .06666667 29.26 1 1 25.98 .29010496 5.157324  4.455237
"Austria"  6 2 2 5 2 2         1 29.26 1 1 29.26         1 5.157324   5.33847
"Austria"  6 1 2 5 2 1  .7063358 29.26 1 1 29.26  .7063358 5.157324  3.087811
"Austria"  6 4 2 5 2 1        .5 29.26 1 1 29.26        .5 5.157324  4.897043
"Austria"  6 5 2 5 2 1  .7498868 29.26 1 1 29.26  .7498868 5.157324 4.4996285
"Austria"  7 2 1 6 2 2  .5237939 17.54 1 4     .  .3265229 5.157324   5.33847
"Austria"  7 4 1 6 2 2         1 17.54 1 4     .   .918285 5.157324  4.897043
"Austria"  7 1 1 6 2 3  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria"  7 5 1 6 2 2         1 17.54 1 4     .         1 5.157324 4.4996285
"Austria"  7 3 1 6 2 3         1 17.54 1 4 10.43  .8763679 5.157324  4.455237
"Austria"  8 3 2 2 1 1         1 17.54 1 1 17.54         1 5.157324  4.455237
"Austria"  8 1 2 2 1 3  .1666621 17.54 1 4     .   .322638 5.157324  3.087811
"Austria"  8 2 2 2 1 2  .5237939 17.54 1 1 25.98  .8562965 5.157324   5.33847
"Austria"  8 4 2 2 1 1         1 17.54 1 4     .   .918285 5.157324  4.897043
"Austria"  8 5 2 2 1 3         1 17.54 1 1 29.26  .7498868 5.157324 4.4996285
"Austria"  9 1 1 4 2 3  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria"  9 4 1 4 2 2         1 17.54 1 4     .   .918285 5.157324  4.897043
"Austria"  9 2 1 4 2 3  .5237939 17.54 1 1 29.26         1 5.157324   5.33847
"Austria"  9 3 1 4 2 1         1 17.54 1 1 17.54         1 5.157324  4.455237
"Austria"  9 5 1 4 2 2         1 17.54 1 4     .         1 5.157324 4.4996285
"Austria" 10 5 1 4 2 1         1 17.54 1 4     .         1 5.157324 4.4996285
"Austria" 10 1 1 4 2 3  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria" 10 4 1 4 2 2         1 17.54 1 4     .   .918285 5.157324  4.897043
"Austria" 10 2 1 4 2 3  .5237939 17.54 1 4 10.43  .9784539 5.157324   5.33847
"Austria" 10 3 1 4 2 1         1 17.54 1 1 17.54         1 5.157324  4.455237
"Austria" 11 4 1 2 2 2        .5 29.26 1 1 17.54         1 5.157324  4.897043
"Austria" 11 5 1 2 2 3  .7498868 29.26 1 1 17.54         1 5.157324 4.4996285
"Austria" 11 2 1 2 2 1         1 29.26 1 1 29.26         1 5.157324   5.33847
"Austria" 11 3 1 2 2 1 .06666667 29.26 1 1 25.98 .29010496 5.157324  4.455237
"Austria" 11 1 1 2 2 3  .7063358 29.26 1 4 10.43  .3053001 5.157324  3.087811
"Austria" 12 2 1 4 2 2         1 29.26 1 1 29.26         1 5.157324   5.33847
"Austria" 12 4 1 4 2 2        .5 29.26 1 1 17.54         1 5.157324  4.897043
"Austria" 12 3 1 4 2 3 .06666667 29.26 1 4 10.43  .8763679 5.157324  4.455237
"Austria" 12 5 1 4 2 1  .7498868 29.26 1 1 29.26  .7498868 5.157324 4.4996285
"Austria" 12 1 1 4 2 3  .7063358 29.26 1 1 17.54  .1666621 5.157324  3.087811
"Austria" 13 5 2 2 2 1  .7498868 29.26 1 4 10.43  .9685511 5.157324 4.4996285
"Austria" 13 1 2 2 2 3  .7063358 29.26 1 4 10.43  .3053001 5.157324  3.087811
"Austria" 13 2 2 2 2 3         1 29.26 1 1 25.98  .8562965 5.157324   5.33847
"Austria" 13 3 2 2 2 3 .06666667 29.26 1 4 10.43  .8763679 5.157324  4.455237
"Austria" 13 4 2 2 2 2        .5 29.26 1 1 25.98  .8750908 5.157324  4.897043
"Austria" 14 5 1 6 2 3  .7498868 29.26 1 4     .         1 5.157324 4.4996285
"Austria" 14 4 1 6 2 2        .5 29.26 1 4     .         1 5.157324  4.897043
"Austria" 14 2 1 6 2 3         1 29.26 1 1 25.98  .8562965 5.157324   5.33847
"Austria" 14 3 1 6 2 1 .06666667 29.26 1 1 25.98 .29010496 5.157324  4.455237
"Austria" 14 1 1 6 2 2  .7063358 29.26 1 4     .   .322638 5.157324  3.087811
"Austria" 15 4 2 2 1 3  .8750908 25.98 1 4     .   .918285 5.157324  4.897043
"Austria" 15 1 2 2 1 1  .0978288 25.98 1 4     . .02127939 5.157324  3.087811
"Austria" 15 2 2 2 1 3  .8562965 25.98 1 1 29.26         1 5.157324   5.33847
"Austria" 15 5 2 2 1 3         1 25.98 1 1 29.26  .7498868 5.157324 4.4996285
"Austria" 15 3 2 2 1 3 .29010496 25.98 1 1 17.54         1 5.157324  4.455237
"Austria" 16 3 2 5 2 1         1 17.54 1 1 17.54         1 5.157324  4.455237
"Austria" 16 4 2 5 2 2         1 17.54 1 4     .   .918285 5.157324  4.897043
"Austria" 16 2 2 5 2 3  .5237939 17.54 1 1 29.26         1 5.157324   5.33847
"Austria" 16 5 2 5 2 2         1 17.54 1 4     .         1 5.157324 4.4996285
"Austria" 16 1 2 5 2 3  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria" 17 3 1 1 4 3         1 17.54 1 4 10.43  .8763679 5.157324  4.455237
"Austria" 17 2 1 1 4 2  .5237939 17.54 1 4     .  .3265229 5.157324   5.33847
"Austria" 17 1 1 1 4 3  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria" 17 5 1 1 4 3         1 17.54 1 1 29.26  .7498868 5.157324 4.4996285
"Austria" 17 4 1 1 4 2         1 17.54 1 1 29.26        .5 5.157324  4.897043
"Austria" 18 2 2 6 2 2  .8562965 25.98 1 4 10.43  .9784539 5.157324   5.33847
"Austria" 18 5 2 6 2 1         1 25.98 1 1 25.98         1 5.157324 4.4996285
"Austria" 18 3 2 6 2 3 .29010496 25.98 1 4 10.43  .8763679 5.157324  4.455237
"Austria" 18 4 2 6 2 2  .8750908 25.98 1 1 29.26        .5 5.157324  4.897043
"Austria" 19 5 1 4 1 3         1 17.54 1 1 29.26  .7498868 5.157324 4.4996285
"Austria" 19 1 1 4 1 1  .1666621 17.54 1 1 17.54  .1666621 5.157324  3.087811
"Austria" 19 3 1 4 1 3         1 17.54 1 4 10.43  .8763679 5.157324  4.455237
"Austria" 19 4 1 4 1 2         1 17.54 1 4     .   .918285 5.157324  4.897043
"Austria" 19 2 1 4 1 3  .5237939 17.54 1 1 29.26         1 5.157324   5.33847
"Austria" 20 4 1 2 3 3  .8750908 25.98 1 4     .   .918285 5.157324  4.897043
"Austria" 20 5 1 2 3 1         1 25.98 1 4 10.43  .9685511 5.157324 4.4996285
"Austria" 20 1 1 2 3 2  .0978288 25.98 1 1 17.54  .1666621 5.157324  3.087811
"Austria" 20 3 1 2 3 2 .29010496 25.98 1 1 17.54         1 5.157324  4.455237
"Austria" 20 2 1 2 3 1  .8562965 25.98 1 1 25.98  .8562965 5.157324   5.33847
"Austria" 21 5 1 4 2 1         1 25.98 1 1 25.98         1 5.157324 4.4996285
end
label values domain domain3
label def domain3 1 "lr", modify
label def domain3 2 "redistribution", modify
label def domain3 3 "immigration", modify
label def domain3 4 "eu", modify
label def domain3 5 "gender", modify
label values sex D10
label def D10 1 "Male", modify
label def D10 2 "Female", modify
label values age D11R2
label def D11R2 1 "16/18-24", modify
label def D11R2 2 "25-34", modify
label def D11R2 3 "35-44", modify
label def D11R2 4 "45-54", modify
label def D11R2 5 "55-64", modify
label def D11R2 6 "65+", modify
label values education D8
label def D8 1 "15-", modify
label def D8 2 "16-19", modify
label def D8 3 "20+", modify
label def D8 4 "Still Studying", modify
label values voted_party_repgapcat voted_party_repgaplr
label def voted_party_repgaplr 1 "Low", modify
label def voted_party_repgaplr 2 "Medium", modify
label def voted_party_repgaplr 3 "High", modify
label values voted_party_mainstream vote_clos
label def vote_clos 0 "No", modify
label def vote_clos 1 "Yes", modify
label values voted_closest_dyad dyads
label def dyads 1 "Mainstream/Mainstream", modify
label def dyads 2 "Challenger/Mainstream", modify
label def dyads 3 "Challenger/Challenger", modify
label def dyads 4 "Mainstream/Challenger", modify

As you can see, individuals are also nested in countries. My dependent variable is voted_party_repgapcat, which varies across domains within id and between ids. The independent variables vary at different levels. The socio-demographics (sex, age, education), voted_party_blur and voted_party_voteprev_el are fixed within the same id but vary across individuals and countries. Other predictors, such as voted_closest_dyad, closest_party_voteprev_el and closest_party_blur, also vary within id (so across id-by-domain combinations). Polarization is domain-specific, i.e. it varies within the same id but takes the same values for all individuals in the same country. Finally, n_parties varies only across countries.

It was suggested that I use a cross-nested model, but I am not very familiar with it.

I would really appreciate your suggestions on the most appropriate specification and code to employ.

Sincerely
Mattia


Mattia Gatti

Join Date: May 2023

Posts: 39
#2

19 Nov 2024, 06:54

Actually, sorry for the typo. My dependent variable is voted_party_repgap (not shown here), so continuous, not a categorical one.

Sincerely
M
Erik Ruzek

Join Date: Oct 2017

Posts: 386
#3

19 Nov 2024, 08:42

Hello Mattia,

Your data example, which is helpful, contains only one country, so we cannot use it to fit the same model that you would fit with the full data. But that is ok. You have a couple of options for how to fit this model. One approach is to stick to a three-level model and treat domain as a fixed factor:

Code:

mixed voted_party_repgap i.domain || countryname: || id:

Since you only have 5 domains, this might be the most prudent model as mixed effects models with data such as this produce random effect variances that have a lot of uncertainty associated with them.
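
A minimal sketch of how this first specification might be run and inspected. Here country_id is a name introduced purely for illustration (a numeric version of countryname), and voted_party_repgap is the continuous outcome mentioned in #2, which is not part of the data example:

Code:

* Numeric country identifier for grouping (country_id is a hypothetical name)
capture confirm variable country_id
if _rc encode countryname, generate(country_id)

* Three-level model: domain as a fixed factor, random intercepts for country and id
mixed voted_party_repgap i.domain || country_id: || id:

* Joint Wald test of the domain indicators
testparm i.domain

* Intraclass correlations at the country and id-within-country levels
estat icc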

If you wanted to fit the full crossed random effects specification, you would use the following syntax:

Code:

mixed voted_party_repgap || _all: R.domain || countryname: || id:

The advantage to fitting the model this way is that you get a variance estimate for domain that can be compared to the variance estimates for country and id. This allows you to see the relative contribution to the DV of each random effect.

Both models deal with the cross-classification, which means that the domain effect is treated as the same (systematic) for each country and id. The fixed model is probably the easier way to deal with it. Note that if you did not model domain at all, its effect would be unique to each id and country and would be subsumed in the residual.
Mattia Gatti

Join Date: May 2023

Posts: 39
#4

19 Nov 2024, 09:49

Dear Erik,

thanks a lot for your reply. Yes, I included only one country in the example though I have 16.

If I understood correctly, in the first option, we are considering the domain effect to be fixed, while we specify random intercepts at id and country level. At the same time, since domain is not only a blocking factor but it features in my hypotheses, I guess the first specification is the most appropriate.
Another complexity I forgot to mention is the unbalanced nature of my data. 70% of my IDs contain 5 obs, 25% 4 obs, 5% 3, and the remaining part is either 2 or 1. How to account for that?

Further, while the overall specification is clear, I have some doubts about the specific variables and the levels to which they pertain. For example, sex, age and education are fixed within IDs but vary between IDs, so if I am correct, I should specify random coefficients on those variables after || id:. At the same time, my hypotheses also pertain to the socio-demographics. How can I test them if I include these predictors only as random coefficients?

Finally, a variable such as polarization varies across domains and countries but it is fixed across individuals. What would be the best specification for that?

Sorry for the whole load of questions, and thanks!

Sincerely
Mattia
Erik Ruzek

Join Date: Oct 2017

Posts: 386
#5

19 Nov 2024, 17:29

I will try to respond to each of your questions, below.

Another complexity I forgot to mention is the unbalanced nature of my data. 70% of my IDs contain 5 obs, 25% 4 obs, 5% 3, and the remaining part is either 2 or 1. How to account for that?

This isn't really a problem for the model. For your results, you will have larger standard errors associated with the domain coefficients, which will be magnified for those domains with less data than others. One thing you should consider is whether this missingness is systematic in some way, and if so, include covariates that explain it.
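
One way to look at the imbalance, sketched under the assumption that a domain counts as observed when voted_party_repgap is nonmissing. The names resp_tag and n_domains are introduced here for illustration only:

Code:

* Flag one row per respondent (id within country)
egen byte resp_tag = tag(countryname id)

* Number of domains with a nonmissing outcome for each respondent
bysort countryname id: egen byte n_domains = count(voted_party_repgap)

* Distribution of the number of observed domains across respondents
tabulate n_domains if resp_tag

* Does the number of observed domains differ by respondent characteristics?
tabulate n_domains sex if resp_tag, column
tabulate n_domains education if resp_tag, column

If the column percentages differ noticeably across groups, that would be a hint that the missingness is related to those characteristics.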

For example, sex age education are fixed within IDs but vary between IDs, so if I am correct, I should specify random coefficients on those variables...At the same time, my hypotheses pertain also to the socio-demographics. How can I test for them only by including these predictors as random coefficients?

You need to tell us your hypotheses. Without this information, it is hard to tell whether you should allow these variables to have varying associations with the outcome (random slopes). No matter your hypotheses, you may want to include their respective country means as additional non-varying (fixed) predictors to help deal with any potential endogeneity.
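
As a rough illustration of the country-mean idea, here is a sketch for sex only. The names female, one_per_id, female_cm and country_id are hypothetical, and the mean is taken over respondents rather than stacked rows because the number of rows per id is unbalanced:

Code:

* Numeric country identifier for grouping (country_id is a hypothetical name)
capture confirm variable country_id
if _rc encode countryname, generate(country_id)

* Indicator for women (sex: 1 = Male, 2 = Female)
generate byte female = (sex == 2) if !missing(sex)

* Country mean of female, using one row per respondent
egen byte one_per_id = tag(countryname id)
bysort countryname: egen female_cm = mean(cond(one_per_id, female, .))

* Individual-level indicator plus its country mean as fixed predictors
mixed voted_party_repgap i.female c.female_cm i.domain || country_id: || id:

The same pattern could be repeated for the age and education categories if desired.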

Finally, a variable such as polarization varies across domains and countries but it is fixed across individuals. What would be the best specification for that?

Again, what is the hypothesis you want to test about polarization? Or is it just a covariate?
Mattia Gatti

Join Date: May 2023

Posts: 39
#6

20 Nov 2024, 03:55

Dear Erik, thanks a lot again.

First point, clear!

Concerning the second point, my hypotheses are the following:
H1: Ceteris paribus, women have larger individual repr.gaps than men
H2: Ceteris paribus, younger voters have larger individual repr.gaps than older voters
H3: Ceteris paribus, the higher the income, the smaller the individual repr.gap.

As for their country means, I suppose these should go in the fixed-effects equation.

Concerning the third point:
Hx: Ceteris paribus, the greater the party-system polarization, the smaller the individual repr.gap.

What you said about allowing these variables to have varying associations with the outcome or not based on being focal variables or covariates is very interesting. So, I guess that the distinction between explanatory variable of interest/covariate also impacts how I specify those in the model.

Sincerely
Mattia
Erik Ruzek

Join Date: Oct 2017

Posts: 386
#7

21 Nov 2024, 08:16

Hi Mattia,

Sorry for the delayed reply. All of your hypotheses are about individual predictors' associations with the outcome. They do not involve the possibility that these associations vary across higher-level units. That is the essence of random slopes: you are interested in an association between a predictor and the outcome and in whether it varies across clusters (id and/or countries in your case). Often in these models, those varying associations are thought to be explained by cluster-level covariates, which necessitates interactions in the fixed-effects part of the model.

For example, imagine that you hypothesized that women's voting patterns varied based on a country level factor (e.g., degree of authoritarianism, which I don't think you measure but let's pretend you did). Then the model to test that hypothesis is the following:

Code:

mixed voted_party_repgap i.sex##c.authoritarianism i.domain || countryname: sex, cov(unstructured) || id:
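
Before adding a cross-level interaction, one might first check whether the association with sex varies across countries at all. A sketch, leaving out the hypothetical authoritarianism variable and assuming both models are fit by maximum likelihood on the same sample (ri, rs and country_id are illustrative names):

Code:

* Numeric country identifier for grouping (country_id is a hypothetical name)
capture confirm variable country_id
if _rc encode countryname, generate(country_id)

* Random intercepts only
mixed voted_party_repgap i.sex i.domain || country_id: || id:
estimates store ri

* Add a country-level random slope on sex
mixed voted_party_repgap i.sex i.domain || country_id: sex, cov(unstructured) || id:
estimates store rs

* Likelihood-ratio test of the random slope and its covariance with the intercept
lrtest ri rs

Because variance components are tested on the boundary of the parameter space, this likelihood-ratio test is conservative.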

These models are very flexible. You can think of each random effect as its own outcome. In your model, you can use time-varying predictors to explain within-id variance (residual), id level predictors to explain id-level variance, country level predictors to explain country-level variance, and country level predictors to explain id level random slopes. The trick is to remember that because each random effect is its own outcome, you need to deal with confounding for each of them that you aim to predict. That, plus endogeneity (correlation between the random effect and lower-level predictors) is perhaps why they are not the first tool of choice for panel data among econometricians.
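
For the hypotheses as stated (H1, H2 and the polarization hypothesis), which involve only fixed associations, a minimal sketch might look as follows. Here country_id is an illustrative name, and H3 is left out because no income variable appears in the data example. With i.domain in the model, the polarization coefficient would presumably be identified from between-country differences within each domain:

Code:

* Numeric country identifier for grouping (country_id is a hypothetical name)
capture confirm variable country_id
if _rc encode countryname, generate(country_id)

* Fixed part: socio-demographics, polarization, number of parties, domain
mixed voted_party_repgap i.sex i.age i.education c.polarization c.n_parties i.domain || country_id: || id:

* H1: women versus men (sex: 1 = Male, 2 = Female)
lincom 2.sex

* H2: joint test of the age-group indicators, then adjusted means by age group
testparm i.age
margins age

* Polarization hypothesis: sign and size of the polarization coefficient
lincom polarization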

Last edited by Erik Ruzek; 21 Nov 2024, 08:22. Reason: Fixed syntax
