imputation for missing values and point estimate when you have a variable which is completely bracketed

gibson.mudiriza

Join Date: Apr 2014
Posts: 3

11 Apr 2014, 17:21

I'm a new user of Stata and already forced to jump into bigger issues. I'm not sure about the imputations I did, so I just need someone to confirm that they are correct. This is how my variable looks:

Income                     Freq.          Percent    Cum.

 1. no income              1,136,644      49.06      49.06
 2. R1 - R400              213,710.9       9.22      58.28
 3. R401 - R800            362,154.39     15.63      73.91
 4. R801 - R1600           199,897.98      8.63      82.54
 5. R1601 - R3200          172,102.47      7.43      89.97
 6. R3201 - R6400          125,421.5       5.41      95.38
 7. R6401 - R12 800        66,210.959      2.86      98.24
 8. R12 801 - R25 600      25,469.541      1.10      99.34
 9. R25 601 - R51 200      8,683.8753      0.37      99.72
10. R51 201 - R102 400     3,364.3909      0.15      99.86
11. R102 401 - R204 800    2,186.1542      0.09      99.96
12. R204 801+              1,028.276       0.04     100.00

Total                      2,316,874     100.00

To impute the missing values, this is what I did.
/* Redefining zero income: set equal to
   1. missing if individual is age>=15, zero if age<15
   2. missing if age<15 & income>6
   3. missing if income==0 & employed==1 */
My actual imputation:

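mi set mlong
mi register imputed income

* Impute the income bracket with an ordered logit, adding 10 imputations
mi impute ologit income age i.province i.industry hours_worked i.occupation ///
    i.educ_grpd gender race residence employed i.married i.citizen, add(10) force

* Fit the same ordered logit on the imputed data and save the estimates
mi estimate, saving(miest, replace): ologit income age i.province i.industry ///
    hours_worked i.occupation i.educ_grpd gender race residence employed ///
    i.married i.citizen

* Linear prediction (xb) from the saved estimates, for employed people only
mi predictnl income_hat = predict(xb) if employed==1 using miest

* Flag the observations filled in from the prediction, then fill them
gen income_imputed = 1 if employed==1 & income==. & income_hat!=.
replace income_imputed = 0 if income_imputed!=1
replace income = income_hat if income_imputed==1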

After this I want to impute a point estimate for income, to get a continuous variable, and I'm not sure how to go about it.
From the literature, this is what I picked up:
1. Generate a CDF for different distributions (normal, Pareto, uniform and lognormal) for each band.
2. Then generate a random probability for each individual.
3. Then assign income such that the cumulative probability of observing such a value from the distribution is >= the generated probability.
I'm not sure about these 3 steps.
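As a rough sketch of how I understand these three steps, here is the lognormal case only, written as an inverse-CDF draw inside each band. Everything in it is an assumption for illustration: the toy data, the variable names (lower, upper, F_lo, F_hi, u, income_cont), the placeholder values of mu and sigma (in practice they would have to be estimated from the bracketed data), and the choice to set the "no income" band to zero.

```stata
* Sketch only, on toy data: one continuous income draw per person, inside the band
clear
set seed 12345                              // arbitrary seed, for reproducibility
set obs 1000
gen byte income = ceil(12 * runiform())     // toy band codes 1-12, as in the table

* Band bounds in rand; band 1 (no income) has no bounds, band 12 has no upper bound
gen double lower = .
gen double upper = .
replace lower = 1   if income == 2
replace upper = 400 if income == 2
forvalues k = 3/11 {
    replace lower = 400 * 2^(`k' - 3) + 1 if income == `k'
    replace upper = 400 * 2^(`k' - 2)     if income == `k'
}
replace lower = 204801 if income == 12

* Placeholder lognormal parameters on the log scale (made up, not estimated)
scalar mu    = 7
scalar sigma = 1.5

* Step 1: lognormal CDF at the lower and upper edge of each person's band
gen double F_lo = normal((ln(lower) - scalar(mu)) / scalar(sigma)) if income >= 2
gen double F_hi = cond(missing(upper), 1, ///
    normal((ln(upper) - scalar(mu)) / scalar(sigma))) if income >= 2

* Step 2: one random probability per person
gen double u = runiform()

* Step 3: rescale u into [F_lo, F_hi] and invert the CDF to get an income in the band
gen double income_cont = exp(scalar(mu) + scalar(sigma) * ///
    invnormal(F_lo + u * (F_hi - F_lo))) if income >= 2
replace income_cont = 0 if income == 1      // assumption: "no income" means zero
```

A uniform draw within the band would replace step 3 with lower + u * (upper - lower), but that cannot be used for the open top band.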

I'm not doing a study on this income variable itself; I'm going to use it as my dependent variable.

Sorry it's a bit long.
Thank you.
