Hello everyone,
I am building a statistical model that predicts basic income preferences among voters, using data from Wave 8 of the European Social Survey (2016). To account for country-level variation, I am using a multilevel logistic regression model in which individual respondents (level 1 units) are nested within countries (level 2 units). I have a fair amount of experience with regression analysis in Stata, but I am effectively a novice at multilevel modeling and lack a conceptual background in the technique.
The dependent variable is a binary indicator of whether a respondent supports or opposes basic income. I have three sets of independent variables that I want to test in a series of models: individual demographic variables (such as age and income), individual attitudinal variables (various preferences about redistribution), and country-level contextual variables (such as the level of means-tested welfare spending in the respondent's country). The first two sets are measured at level 1, while the third set is measured at level 2. To be clear, I am looking to (1) isolate the effects of individual-level predictors while controlling for country-level heterogeneity and (2) estimate the effect of country-level predictors on basic income preferences.
1) I've read a decent amount of statistical literature suggesting that multilevel models with a small level-2 sample size lead to biased standard errors for the level-2 estimates. There are 21 countries in my analysis. Will this pose a problem for the reliability of my country-level estimates? Is there another model or regression structure that I should consider in order to circumvent this problem? Some papers suggest that a fixed effects model would allow me to estimate the "moderating effect" of a country-level variable through cross-level interactions. However, I am interested in the direct effect of each country-level variable, particularly because I have many level 1 and level 2 variables, and it would be too time-consuming to estimate moderating effects between a country-level variable and every individual-level variable.
2) Is it necessary to specify a random slope in the regression model? In what situations would it be appropriate to do so? Currently, the syntax of my base model looks like
Code:
melogit basicincome agea age2 gndr hinctnta eduyrs uemp5yr mbtru_curr mbtru_prev RTIscore [pw=pweight] || cntry:
where no covariate is specified in the random-effects part of the syntax, introduced after the "||", so only a random intercept for country is estimated.
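For comparison, my understanding is that a random-slope version would list the varying covariate after the colon. The following is only a sketch: the choice of hinctnta as the covariate whose slope varies across countries is an assumption made purely for illustration, and the unstructured covariance option is likewise illustrative rather than something I have settled on.
Code:
* Sketch only: hinctnta as the random-slope variable is an illustrative assumption
* covariance(unstructured) also estimates the intercept-slope covariance
melogit basicincome agea age2 gndr hinctnta eduyrs uemp5yr mbtru_curr mbtru_prev RTIscore [pw=pweight] || cntry: hinctnta, covariance(unstructured)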
3) Is there any way to calculate the explained variance of each of these models?