Hi,
Suppose I have the independent variables x1, x2, and age, and the dependent variable y.
I create a new variable, x12, based on the values of x1 and x2.
Now I want to predict y based on age and x12:
y = x12 + age
There is some missingness in x1 and x2, which results in missingness in x12, so I want to impute the values. However, since x12 is derived from x1 and x2, I would really like to impute x1 and x2 and then recalculate x12.
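To make that alternative concrete, here is a rough sketch of what I mean by "impute x1 and x2, then recalculate x12". It is only an illustration under assumptions: x12 = x1 + x2 is a placeholder for my real definition of x12, add(20) and rseed(12345) are arbitrary values, and I am assuming y also has missing values that need imputing.

```stata
* Sketch only, not a tested analysis.
* Assumptions: x12 = x1 + x2 stands in for the real definition of x12;
* add(20) and rseed(12345) are arbitrary illustrative choices.
mi set wide
mi register imputed x1 x2 y
mi register passive x12

* Impute the components (and y) rather than the derived variable
mi impute chained (regress) x1 x2 y = age, add(20) rseed(12345)

* Recompute the derived variable in every imputed dataset
mi passive: replace x12 = x1 + x2

* Fit the analysis model across imputations
mi estimate: regress y x12 age
```

Here x12 is never imputed directly; it is rebuilt from the imputed x1 and x2, so it always stays consistent with them.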
However, I was reading the Stata manual (https://www.stata.com/manuals13/mi.pdf), p. 156, which outlines that I can use the include() option.
Can anyone confirm whether this method would be correct for what I want to do?
mi impute chained (regress, include(x1 x2)) x12 (regress) y = age
P.S. I added a second (regress) in the code above for y, as I assumed that doing it this way would keep x1 and x2 out of the imputation model for y.