Hello Statalisters,
I need clarification regarding an IV-Decomposition model. I have separately estimated an Instrumental Variables (IV) regression, and now I want to decompose the results using the Oaxaca-Blinder technique to examine wage differentials between men and women.
To estimate the IV-Decomposition, I first obtained the predicted values from the IV regression and then used "school_hat" in the Oaxaca-Blinder decomposition, as follows:
ivregress 2sls logwage (school = ube_north ube_south) age agesq p_educ educ_qual female, vce(cluster hhid) first
predict school_hat, xb
oaxaca logwage school_hat age agesq p_educ educ_qual, by(female) vce(cluster hhid)
However, I have concerns regarding the sample size. I expected it to remain the same as in the original IV estimation (5,400 obs), but the IV-Decomposition appears to use the full sample (8,000 obs). I suspect this discrepancy arises because I grouped the Oaxaca-Blinder model by female, which is an explanatory variable in the original IV model. Could this be introducing additional observations that were not included in the IV estimation?
Apart from this, the results appear statistically and economically reasonable. Am I missing something in how the sample is selected for the Oaxaca-Blinder decomposition after IV estimation? Any insights into why this is happening and whether it affects the validity of my results would be greatly appreciated.
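A minimal sketch of one way to check this, assuming the extra observations arise because predict fills in values for every observation with non-missing regressors rather than only for the estimation sample. The names iv_sample and school_hat_fs are hypothetical and not part of the original commands. The sketch also takes the fitted schooling from an explicit first-stage regression, since predict with xb after ivregress returns the linear prediction of logwage rather than of school.

```stata
* Sketch only: iv_sample and school_hat_fs are hypothetical names.
ivregress 2sls logwage (school = ube_north ube_south) age agesq p_educ educ_qual female, vce(cluster hhid) first

* Flag the observations actually used in the IV estimation.
generate byte iv_sample = e(sample)
count if iv_sample

* First-stage fitted schooling, restricted to the IV sample.
regress school ube_north ube_south age agesq p_educ educ_qual female if iv_sample
predict double school_hat_fs if iv_sample, xb

* Run the decomposition on the same observations as the IV model.
oaxaca logwage school_hat_fs age agesq p_educ educ_qual if iv_sample, by(female) vce(cluster hhid)
```

If the count matches the IV sample size and the decomposition then reports the same number of observations, the discrepancy comes from the prediction step and not from by(female). Note that the standard errors from the decomposition would not account for school_hat_fs being an estimated regressor.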
Many Thanks