I'm using Stata 14.1. I have panel data observed at baseline and at 6 follow-up assessments. The response variable is an overdispersed count variable with missing values. 15 of 226 observations were missing at all follow-up assessments. I'm trying to generate 20 complete data sets using chained equations. There are no missing data in any of the right-hand-side variables. My code is given under "Code:" at the end of this post.
I then get a table summarizing the imputations:
------------------------------------------------------------------
                   |               Observations per m
                   |----------------------------------------------
          Variable |   Complete   Incomplete   Imputed |     Total
-------------------+-----------------------------------+----------
           acount1 |        195           31        27 |       226
           acount3 |        178           48        42 |       226
           acount6 |        173           53        45 |       226
           acount9 |        162           64        52 |       226
          acount12 |        161           65        53 |       226
          acount15 |        160           66        55 |       226
------------------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
 of the number of filled-in observations.)

Note: Right-hand-side variables (or weights) have missing values;
      model parameters estimated using listwise deletion.
The command is not fully populating the imputed data sets. For acount1, the minimum number imputed is 27, while 31 observations have incomplete data. The same thing happens when I use the orderasis option or the nomonotone option, and when I specify mi impute chained (poisson). However, if I treat the data as continuous and specify mi impute (reg), I'm able to generate 20 fully populated imputed data sets, albeit with the wrong model specification for this outcome.
If I don't use the force option I get an r(498) error:
acount6: missing imputed values produced
This may occur when imputation variables are used as independent variables or when independent variables contain missing values. You can specify option force if you wish to proceed anyway.
Finally, if I use the noimputed option the procedure runs without error and fully populates all 20 imputed data sets. However, since these are panel data and most subjects are observed on multiple occasions, this would exclude the most valuable information regarding missing values.
Any thoughts or suggestions would be much appreciated.
Brad
Code:
mi impute chained (nbreg) acount1 acount3 acount6 acount9 acount12 acount15 ///
    = cond age gender white hispanic black schfull schpart employed ///
    agealc nalabdis agemj nmjabdis blarate blbrate blmjrate ///
    , add(20) force