I am conducting stepwise regression within multiply imputed data, using the approach as specified by Wood, White & Royston (2008) https://onlinelibrary.wiley.com/doi/....1002/sim.3177
I have run my multiple imputation successfully, creating 20 datasets. Please see code below.
The challenge arises when I attempt to use the Wald method for variable selection within my dataset. Here, I fit the model with all predictors in each dataset, then calculate the Wald statistic for each predictor and pool these using Rubin's Rules. Then, for each dataset, I want to identify the variable with the smallest Wald chi2, get its p-value, and remove it from the model if it exceeds p = 0.1. I then want to repeat the process in all imputed datasets until all remaining variables meet my p-value criterion.
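For reference, the pooling step described above can be written out as follows. This is the standard form of Rubin's rules for a single coefficient. The notation is introduced here for illustration only: $m$ is the number of imputations, and $\hat{\beta}_i$ and $U_i$ are the estimate and its squared standard error in imputed dataset $i$. Whether this matches the exact variant used by Wood, White & Royston is an assumption.

```latex
\begin{align}
\bar{\beta} &= \frac{1}{m}\sum_{i=1}^{m}\hat{\beta}_i, &
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}U_i, \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{\beta}_i-\bar{\beta}\bigr)^2, &
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B, \\
W &= \frac{\bar{\beta}}{\sqrt{T}}.
\end{align}
```

Here $W$ is the pooled Wald (z-type) statistic for the coefficient, and $W^2$ is the corresponding chi2-type quantity. Note that the inputs are the per-dataset estimates and variances rather than the per-dataset Wald statistics themselves.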
When I run my loop to compute the Wald statistics, two errors arise:
Errors:
1. gen dataset = `m'
invalid syntax
r(198);
2. . }
} is not a valid command name
r(199);
What am I doing wrong?
Loop code:
forval `m' = 1/20 {
preserve
di "Running Poisson Regression for Imputed dataset `m'"
mi extract`m'
poisson diagnos female ethn anx medinc bully sociso
matrix b_vec = e(b)
mat v = e(V)
mat list v
mata:
b = st_matrix("b_vec")
V = st_matrix("v")
V = diagonal(V)
se_t = sqrt(V)
se = se_t'
b
se
wald = b:/se
st_matrix("wald", wald)
end
matrix list wald
svmat wald
keep wald*
drop if wald1 == .
gen dataset = `m'
save wald_results_`m'.dta, replace
}
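A minimal sketch of how the loop above might be restructured, offered as a guess at the causes rather than a confirmed fix. It assumes three things: that the loop index should be declared as a bare name (m) rather than in macro quotes, that mi extract needs a space before the index, and that the multi-line mata: ... end block inside the loop braces is what triggers the "} is not a valid command name" error. To sidestep the last point, the sketch computes the same estimate/standard-error ratios with Stata matrix commands instead of Mata. The restore at the end of each pass is also an addition, so that the next preserve does not fail.

```stata
* Sketch only: assumes the data in memory are already mi set with 20 imputations
forvalues m = 1/20 {
    preserve
    display "Running Poisson regression for imputed dataset `m'"
    mi extract `m', clear
    poisson diagnos female ethn anx medinc bully sociso
    matrix b_vec = e(b)
    matrix v = e(V)
    * Wald z statistic for each coefficient: estimate / standard error
    local k = colsof(b_vec)
    matrix wald = J(1, `k', .)
    forvalues j = 1/`k' {
        matrix wald[1, `j'] = b_vec[1, `j'] / sqrt(v[`j', `j'])
    }
    matrix list wald
    * Replace the extracted data with a one-row dataset of the statistics
    clear
    svmat wald
    generate dataset = `m'
    save wald_results_`m'.dta, replace
    restore
}
```

The values stored in wald1, wald2, ... are z-type statistics (coefficient divided by standard error), with the last column belonging to the constant. Squaring them would give the chi2 form mentioned above. This sketch only saves per-dataset statistics and does not perform the pooling step.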
Multiple imputation:
mi set wide
mi register imputed diagnos female ethn anx medinc bully sociso
mi impute chained (logit) diagnos female ethn anx medinc bully sociso, add(20) rseed(12345)