The help file for mimrgns states that while "[i]n principle, marginsplot works after mimrgns […], the plotted confidence intervals are based on inappropriate degrees of freedom". However, it also suggests that "the differences should be small for large sample sizes". In this post, I demonstrate how to calculate these differences.
I borrow an example from https://www3.nd.edu/~rwilliam/stats3/MD02.pdf”]Richard Williams’ excellent paper on Multiple Imputation & Maximum Likelihood[/url]. We start with an example dataset where we impute missing values in one variable. Then, we run a logistic regression model and obtain predictive margins.*
Omitting most of the outout, the results for the mimrgns command are
Now, if we were to call marginsplot, the plotted confidence intervals would not match what mimrgns reports. How large would the differences be? To find out, we first store mimrgns' results in a matrix. More precisely, we store the coefficients, standard errors, and confidence limits. Those results are stored in rows 1, 2, 5, and 6 of r(table). Since we will later store these results as variables, we transpose the matrix.
Now, that we have stored mimrgns' results, we will replicate the file that marginsplot would use. To do so, we need to dig a little deeper into marginsplot's internals. marginsplot internally calls margins with its undocumented saving() option. Undocumented means that there is an online help file but no corresponding entry in the manual. More importantly, it means that this option might behave differently in the future, even under version control. Anyway, margins, with the saving() option, calls a utility routine, _marg_save, which is not documented. Not documented means that there is neither an online help file nor an entry in the manual. The code is implemented as an ado-file, so we can learn how it works. Obviously, it might work differently in the future.
We can now load the created file and have a look at some of the variables
The output is
From a first glance, it appears as if the results match those from mimrgns. To be sure, we will save the mimrgns results in the same dataset and calculate the respective differences.
Finally, we can look at the differences.
which shows
We find that the point estimates and the standard errors match exactly; the difference is flat 0. As stated in the help file, the confidence intervals do not match exactly. In this example, we find that the differences are in the 4th decimal place or later. We will not be able to spot those differences in a graph.
Note that there is no guarantee that the differences will always be that small. However, perhaps the waring in the help file overstates the problem. Anyway, you now know how to check when in doubt.
Best
Daniel
* I have modified the mimrgns call to include imputed variables in the at() option. The observed differences for confidence intervals are even smaller (virtually 0) in the original example.
I borrow an example from https://www3.nd.edu/~rwilliam/stats3/MD02.pdf”]Richard Williams’ excellent paper on Multiple Imputation & Maximum Likelihood[/url]. We start with an example dataset where we impute missing values in one variable. Then, we run a logistic regression model and obtain predictive margins.*
Code:
version 12.1 webuse mheart0, clear mi set mlong mi register imputed bmi mi register regular attack smokes age hsgrad female mi impute regress bmi attack smokes age hsgrad female, add(20) rseed(2232) mi estimate, dots: logit attack i.smokes age bmi i.hsgrad i.female mimrgns smokes, at(bmi = (20(5)35)) predict(pr) cmdmargins
Code:
(output omitted) . mimrgns smokes, at(bmi = (20(5)35)) predict(pr) cmdmargins Multiple-imputation estimates Imputations = 20 Predictive margins Number of obs = 154 Average RVI = 0.0419 Largest FMI = 0.1500 DF adjustment: Large sample DF: min = 866.55 avg = 41276.27 Within VCE type: Delta-method max = 191492.08 Expression : Pr(attack), predict(pr) 1._at : bmi = 20 2._at : bmi = 25 3._at : bmi = 30 4._at : bmi = 35 ------------------------------------------------------------------------------ | Margin Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- _at#smokes | 1 0 | .2132581 .0594192 3.59 0.000 .0967207 .3297954 1 1 | .4716674 .0832583 5.67 0.000 .3084512 .6348835 2 0 | .3239331 .0498932 6.49 0.000 .2261437 .4217226 2 1 | .6118189 .0617438 9.91 0.000 .4908022 .7328357 3 0 | .4585168 .0750399 6.11 0.000 .3113833 .6056503 3 1 | .7356222 .071637 10.27 0.000 .5951187 .8761257 4 0 | .5985056 .1179053 5.08 0.000 .3671509 .8298602 4 1 | .830356 .0798759 10.40 0.000 .673583 .9871289 ------------------------------------------------------------------------------
Code:
matrix rtable = r(table) matrix rtable = (rtable[1..2, 1...]\ rtable[5..6, 1...])'
Code:
_marg_save , saving(mimrgns_results , double)
Code:
use mimrgns_results, clear list _margin _se _ci_lb _ci_ub , noobs separator(0)
Code:
. list _margin _se _ci_lb _ci_ub , noobs separator(0) +-----------------------------------------------+ | _margin _se _ci_lb _ci_ub | |-----------------------------------------------| | .21325808 .05941922 .09679856 .3297176 | | .47166739 .08325828 .30848416 .63485062 | | .32393311 .04989317 .22614429 .42172193 | | .61181892 .06174378 .49080334 .7328345 | | .45851679 .07503986 .31144138 .60559221 | | .73562219 .07163704 .59521618 .8760282 | | .59850555 .11790533 .36741535 .82959576 | | .83035595 .07987593 .67380201 .98690989 | +-----------------------------------------------+
Code:
svmat double rtable , names(col) local vars _margin b _se se _ci_lb ll _ci_ub ul forvalues i = 1(2)8 { local var1 : word `i' of `vars' local var2 : word `++i' of `vars' generate double diff_`var2' = abs(`var1'-`var2') }
Code:
list _margin b diff_b _se se diff_se _ci_lb ll diff_ll _ci_ub ul diff_ul , noobs separator(0)
Code:
. list _margin b diff_b _se se diff_se _ci_lb ll diff_ll _ci_ub ul diff_ul , noobs separator(0) +------------------------------------------------------------------------------------------------------------------------------------------+ | _margin b diff_b _se se diff_se _ci_lb ll diff_ll _ci_ub ul diff_ul | |------------------------------------------------------------------------------------------------------------------------------------------| | .21325808 .21325808 0 .05941922 .05941922 0 .09679856 .09672075 .00007781 .3297176 .32979541 .00007781 | | .47166739 .47166739 0 .08325828 .08325828 0 .30848416 .30845124 .00003292 .63485062 .63488354 .00003292 | | .32393311 .32393311 0 .04989317 .04989317 0 .22614429 .22614367 6.181e-07 .42172193 .42172255 6.181e-07 | | .61181892 .61181892 0 .06174378 .06174378 0 .49080334 .49080216 1.180e-06 .7328345 .73283568 1.180e-06 | | .45851679 .45851679 0 .07503986 .07503986 0 .31144138 .3113833 .00005808 .60559221 .60565029 .00005808 | | .73562219 .73562219 0 .07163704 .07163704 0 .59521618 .59511871 .00009746 .8760282 .87612566 .00009746 | | .59850555 .59850555 0 .11790533 .11790533 0 .36741535 .36715088 .00026447 .82959576 .82986022 .00026447 | | .83035595 .83035595 0 .07987593 .07987593 0 .67380201 .67358305 .00021897 .98690989 .98712886 .00021897 | +------------------------------------------------------------------------------------------------------------------------------------------+
Note that there is no guarantee that the differences will always be that small. However, perhaps the waring in the help file overstates the problem. Anyway, you now know how to check when in doubt.
Best
Daniel
* I have modified the mimrgns call to include imputed variables in the at() option. The observed differences for confidence intervals are even smaller (virtually 0) in the original example.