Dear Stata list,
Happy Monday.
I am interested in estimating the effect of language proficiency (measured by a dummy for "very good German command") on the wages of immigrants in Germany. To this end, I run linear regressions of log wages on immigrant status, the "very good German command" dummy and other demographic characteristics. To enrich the specification, I add interaction terms step by step: immigrant status x education, immigrant status x very_good_ger_command, education x very_good_ger_command and, finally, the triple interaction.
The problem I am facing is the following. My data contain natives and immigrants. Natives do not report their language proficiency, so I impute very_good_ger_command = 1 for all natives. That is, I assume they have a very good command of German. When I run the regression loop below, Stata drops several interaction terms because of collinearity.
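To make the imputation and the resulting empty cell concrete, here is a minimal sketch of that step. It assumes that immigrant is coded 0/1 and that very_good_ger_command is missing for natives before imputation; the exact commands in my do-file may differ.

```stata
* Sketch only: assumes immigrant is 0/1 and that natives have a
* missing value of very_good_ger_command before the imputation.
replace very_good_ger_command = 1 if immigrant == 0 & missing(very_good_ger_command)

* Cross-tabulate the two dummies; after the imputation the
* (immigrant = 0, very_good_ger_command = 0) cell should be empty.
tabulate immigrant very_good_ger_command, missing

* Count the observations in that cell explicitly.
count if immigrant == 0 & very_good_ger_command == 0
```

If the count is zero, the proficiency dummy has no variation among natives, which is the situation described above.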
I believe this problem arises from my imputation: because every native is assigned very_good_ger_command = 1, the cell "immigrant = 0, very_good_ger_command = 0" does not exist.
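My reasoning can be written out as follows. This is only a sketch: the symbols D_i (immigrant dummy), G_i (very good German dummy) and the coefficient names are my own notation, and education, the other controls and the absorbed fixed effects are left out.

```latex
% Imputation: G_i = 1 whenever D_i = 0, so the (0,0) cell is empty:
\[
(1 - D_i)(1 - G_i) = 0 \quad \text{for all } i
\;\Longrightarrow\;
D_i G_i = D_i + G_i - 1 .
\]
% The interaction is therefore an exact linear combination of the
% constant and the two main effects. Substituting into
\[
\ln w_i = \alpha + \beta D_i + \gamma G_i + \delta D_i G_i + \varepsilon_i
\]
% gives
\[
\ln w_i = (\alpha - \delta) + (\beta + \delta) D_i + (\gamma + \delta) G_i + \varepsilon_i ,
\]
% so only \beta + \delta and \gamma + \delta are identified, not \delta itself.
```

If this is right, then with the interaction dropped, the coefficient on 1.very_good_ger_command would be the proficiency premium among immigrants, and the coefficient on 1.immigrant would be the gap between immigrants with very good German and natives (both at the base education level in the interacted specifications).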
My code is as follows:
Code:
local Y ln_wages_gro

global x1 immigrant##i.educ_level sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined
global x2 immigrant##i.educ_level immigrant##very_good_ger_command sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined
global x3 immigrant##i.educ_level immigrant##very_good_ger_command educ_level#very_good_ger_command sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined
global x4 immigrant##i.educ_level##very_good_ger_command sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined // triple interaction effect

local X x1 x2 x3 x4

*Loop regressions using reghdfe
foreach y of local Y {
    foreach x of local X {
        eststo reg_`y'_`x': reghdfe `y' ${`x'} , ///
            absorb(cluster_var syear) vce(cluster cluster_var)
    }
}
For example, when running the fourth specification (reg log wages on x4), I get:
Code:
note: 0b.immigrant#0b.very_good_ger_command omitted because of collinearity
note: 0b.immigrant#1b.educ_level#0b.very_good_ger_command omitted because of collinearity
note: 1o.immigrant#3o.educ_level#0b.very_good_ger_command omitted because of collinearity
note: 889.occup_combined omitted because of collinearity

HDFE Linear regression                            Number of obs   =    367,217
Absorbing 2 HDFE groups                           F(  30,     83) =    2930.43
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.4914
                                                  Adj R-squared   =     0.4912
                                                  Within R-sq.    =     0.4422
Number of clusters (cluster_var) =         84     Root MSE        =     0.6110

                                        (Std. err. adjusted for 84 clusters in cluster_var)
------------------------------------------------------------------------------------------------------------
                                           |               Robust
                              ln_wages_gro | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------------------------------+----------------------------------------------------------------
                               1.immigrant |   .0204051    .025504     0.80   0.426    -.0303214    .0711315
                                           |
                                educ_level |
                                         2 |   .1357691   .0266024     5.10   0.000      .082858    .1886801
                                         3 |   .3297254   .0292414    11.28   0.000     .2715655    .3878854
                                           |
                      immigrant#educ_level |
                                       1 2 |  -.0371757   .0263592    -1.41   0.162     -.089603    .0152516
                                       1 3 |  -.1328072   .0353139    -3.76   0.000    -.2030451   -.0625693
                                           |
                   1.very_good_ger_command |  -.1049976   .0209216    -5.02   0.000    -.1466099   -.0633854
                                           |
           immigrant#very_good_ger_command |
                                       0 0 |          0  (empty)
                                       1 1 |          0  (omitted)
                                           |
          educ_level#very_good_ger_command |
                                       2 1 |   .1636651   .0269149     6.08   0.000     .1101325    .2171977
                                       3 1 |   .2293729   .0270932     8.47   0.000     .1754856    .2832602
                                           |
immigrant#educ_level#very_good_ger_command |
                                     0 1 0 |          0  (empty)
                                     0 2 0 |          0  (empty)
                                     0 3 0 |          0  (empty)
                                     1 2 1 |          0  (omitted)
                                     1 3 1 |          0  (omitted)
                                           |
                                       sex |   .3147661   .0136618    23.04   0.000     .2875934    .3419388
                                       age |   .0699323    .003348    20.89   0.000     .0632732    .0765913
                                    age_sq |  -.0974664   .0034906   -27.92   0.000     -.104409   -.0905238
                                   married |  -.0496023   .0130072    -3.81   0.000    -.0754732   -.0237315
                               no_children |  -.0253526   .0028548    -8.88   0.000    -.0310306   -.0196746
                            years_work_exp |    .029932   .0006777    44.17   0.000     .0285841    .0312799
                                           |
                            occup_combined |
                                        82 |  -.1641463   .0176071    -9.32   0.000    -.1991662   -.1291265
                                        83 |  -.4081211   .0177827   -22.95   0.000    -.4434902    -.372752
                                        84 |  -.5554699   .0182121   -30.50   0.000     -.591693   -.5192467
                                        85 |  -.8531703   .0193025   -44.20   0.000    -.8915621   -.8147785
                                        86 |  -.7304313   .0402182   -18.16   0.000    -.8104237   -.6504389
                                        87 |   -.725796   .0225927   -32.13   0.000    -.7707321     -.68086
                                        88 |   -.745428   .0276917   -26.92   0.000    -.8005057   -.6903503
                                        89 |  -1.116623   .0284847   -39.20   0.000    -1.173278   -1.059968
                                       881 |   .7054542   .0220244    32.03   0.000     .6616486    .7492599
                                       882 |   .7558083   .0269996    27.99   0.000     .7021071    .8095094
                                       883 |   .5254447   .0224478    23.41   0.000      .480797    .5700924
                                       884 |   .4204903   .0270373    15.55   0.000     .3667142    .4742664
                                       885 |   .1750119   .0280122     6.25   0.000     .1192968     .230727
                                       886 |   .1494267   .0320236     4.67   0.000      .085733    .2131205
                                       887 |   .2900686   .0192637    15.06   0.000     .2517539    .3283832
                                       888 |   .3160587   .0176544    17.90   0.000     .2809449    .3511726
                                       889 |          0  (omitted)
                                           |
                                     _cons |   5.479861    .083596    65.55   0.000     5.313592     5.64613
I would like to use natives with very good German command as the base category, that is, immigrant = 0 and very_good_ger_command = 1. However, specifying this with the ib#. operator does not work for me.
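For reference, this is the kind of base-level syntax I mean. It is an illustrative sketch using the variable names from my code, not necessarily the exact command I ran. It shows the second specification without the education interaction to keep it short.

```stata
* Illustrative only: request 0 as the base level of immigrant and
* 1 as the base level of very_good_ger_command via the ib#. operator.
reghdfe ln_wages_gro ib0.immigrant##ib1.very_good_ger_command ///
    sex age age_sq married no_children i.educ_level years_work_exp i.occup_combined, ///
    absorb(cluster_var syear) vce(cluster cluster_var)

* Alternative: set the base level once, for all subsequent commands.
fvset base 1 very_good_ger_command
```

As far as I understand, changing the base level only changes which indicator is left out as the reference. It cannot add information about a cell with no observations, so one term would still be dropped.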
Is my problem a syntax issue (of specifying base levels)?
Or is there something fundamentally wrong with my specification, and do I perhaps need to drop "very_good_ger_command" as a stand-alone regressor so that my regression is identified?
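One alternative I have considered is to replace the two dummies with a single three-category variable, so that only cells that actually exist enter the model. The sketch below is hypothetical: lang_group and its value label are names I made up for illustration, and I have not checked whether this parameterization answers my question.

```stata
* Hypothetical variable lang_group (not in my data set):
* 0 = native, 1 = immigrant with very good German, 2 = immigrant without.
generate byte lang_group = 0 if immigrant == 0
replace lang_group = 1 if immigrant == 1 & very_good_ger_command == 1
replace lang_group = 2 if immigrant == 1 & very_good_ger_command == 0
label define lang_group_lbl 0 "Native" 1 "Immigrant, very good German" 2 "Immigrant, other"
label values lang_group lang_group_lbl

* Natives (assumed to have very good German) are the base category.
reghdfe ln_wages_gro ib0.lang_group##i.educ_level ///
    sex age age_sq married no_children years_work_exp i.occup_combined, ///
    absorb(cluster_var syear) vce(cluster cluster_var)

* Proficiency gap among immigrants at the base education level.
lincom 1.lang_group - 2.lang_group
```

With this coding, each lang_group coefficient is a contrast with natives, and the difference between the two immigrant categories would be the proficiency effect I am after.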
I am grateful for any input or tips. Thank you!