I am trying to run an in-sample test of an imputation method using "mi impute". Basically, I want to run the imputation 400 times on different random subsamples to see how large the bias is. I am fairly new to this type of analysis, but I put together some code that did the job, even if it is not very elegant. Yesterday I changed the input variables and hit a strange error.
My code is as follows:
foreach x of numlist 1 2 3 4 5 10 15 25 35 50 {
    keep if SURVEYR==2007
    svyset cluster [w=WTA_ADQ], strata(PROVINCE)
    forvalues i = 1(1)400 {
        preserve
        set seed `i'
        gen rand=runiform()
        xtile perc=rand, nq(100)
        qui gen lnpcexp2=lnpcexp if perc<=`x'
        qui gen imputed=(perc>`x')
        qui mi set flong
        qui mi register imputed lnpcexp2
        qui mi register regular `model_nat_nogis' lnpcexp
        qui mi impute regress lnpcexp2 `model_nat_nogis', add(10) rseed(16071847)
        qui mean lnpcexp2 [aw=WTA_ADQ], over(imputed)
        qui gen poor2=lnpcexp2<lnzref
        qui mi estimate: svy: mean poor2
        qui matrix b=e(b_mi)
        if `i'==1 matrix betas=b
        else matrix betas=(betas \ b)
        restore
    }
    di `x'
    matrix list betas
}
When I run it like this, I get the following error:
"mi impute: VCE is not positive definite
The posterior distribution from which mi impute drew the imputations for lnpcexp2 is not proper when the VCE estimated from the observed data is not positive definite. This may happen, for example, when the number of parameters exceeds the number of observations. Choose an alternate imputation model."
But if I run it with "forvalues i = 1(1)9 {" instead, it works fine. Since the local macro `i' is only used to set the seed and to initialize the matrix, I can't think of any reason why it shouldn't work for two-digit numbers. (It also does not work for "forvalues i = 10(1)11 {".)
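Since the seed decides which observations keep a value of lnpcexp2, one check that might narrow this down is to rebuild the subsample for a single failing seed and compare the number of observed cases with the number of predictors. The following is only a sketch under assumptions: it takes x = 1 and seed 10 as the failing combination, assumes the local `model_nat_nogis' is defined in the same session, and uses the hypothetical locals nobs and k purely for illustration.

```stata
* Diagnostic sketch for one draw (assumed failing case: x = 1, seed = 10)
preserve
keep if SURVEYR==2007
set seed 10
gen rand = runiform()
xtile perc = rand, nq(100)

* Observations that would keep an observed lnpcexp2 for this draw
count if perc <= 1 & !missing(lnpcexp)
local nobs = r(N)

* Rough predictor count; this undercounts if the list uses factor-variable notation
local k : word count `model_nat_nogis'
display "observed obs: `nobs'   predictors: `k'"

* Same regression on the observed subsample; terms omitted for collinearity show up here
regress lnpcexp `model_nat_nogis' if perc <= 1
restore
```

If the regression drops terms or the observed count is close to the predictor count, that would suggest the error depends on which subsample a given seed happens to draw rather than on the number of digits in `i'.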
Again, apologies for the inelegant code and extended question. But this is driving me crazy.