Hi everyone,
I'm running a regression on the World Values Survey, which is individual-level survey data. The data files contain country-level aggregated variables, and the explanatory variable I am focussing on is a policy variable (gender inequality). I am running OLS and ordered logit regressions with country dummies and standard errors clustered at the country level, which is what most of the literature I am following does. However, those papers tend to use repeated cross-sections, whereas I am using only one wave of the data (and I am not sure whether that makes a difference).
The issue is that when I run my regression with the full list of explanatory variables, Stata omits 3/4 countries due to collinearity. I'm not clear why it is these countries specifically, and any help would be appreciated. When I run the regression without the other country-level variables (that is, excluding GDP per capita, unemployment and the Gini coefficient), the problem seems to go away. It is common in my literature to include these other country-level variables. Does this seem like an issue with the data (gender inequality is a composite measure), or am I misspecifying my regressions?
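One check that might help narrow this down, as a minimal sketch only: if each country-level variable takes a single value per country in this one wave, it would be an exact linear combination of the constant and the country dummies, which could be what triggers the omissions. The variable names are the ones in my regression command; the temporary `sd_*` names are made up for illustration.

```stata
* Sketch only: check whether each country-level regressor varies within country.
* sd_* are hypothetical temporary variables, dropped again after each check.
foreach v of varlist genderinequality GDPpercap2 unemploytotal giniWB {
    * within-country standard deviation of the regressor
    bysort ISO31661numericcode: egen double sd_`v' = sd(`v')
    * a maximum of 0 means the variable is constant within every country
    summarize sd_`v'
    drop sd_`v'
}
```

If the maximum reported by `summarize` is 0 for a variable, that variable has no within-country variation in this sample.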
My code is:
reg Q46 genderinequality $X job_scare election_equality home_equality political_equality_perception GDPpercap2 unemploytotal giniWB i.ISO31661numericcode, vce(cluster ISO31661numericcode)
and my output is: