-mimrgns- and -marginsplot-

daniel klein

Join Date: Mar 2014
Posts: 3820

-mimrgns- and -marginsplot-

30 Jan 2019, 14:01

The help file for mimrgns states that while "[i]n principle, marginsplot works after mimrgns […], the plotted confidence intervals are based on inappropriate degrees of freedom". However, it also suggests that "the differences should be small for large sample sizes". In this post, I demonstrate how to calculate these differences.

I borrow an example from https://www3.nd.edu/~rwilliam/stats3/MD02.pdf”]Richard Williams’ excellent paper on Multiple Imputation & Maximum Likelihood[/url]. We start with an example dataset where we impute missing values in one variable. Then, we run a logistic regression model and obtain predictive margins.*

Code:

version 12.1
webuse mheart0, clear
mi set mlong
mi register imputed bmi
mi register regular attack smokes age hsgrad female
mi impute regress bmi attack smokes age hsgrad female, add(20) rseed(2232)
mi estimate, dots: logit attack i.smokes age bmi i.hsgrad i.female
mimrgns smokes, at(bmi = (20(5)35)) predict(pr) cmdmargins

Omitting most of the outout, the results for the mimrgns command are

Code:

(output omitted)
. mimrgns smokes, at(bmi = (20(5)35)) predict(pr) cmdmargins

Multiple-imputation estimates                     Imputations     =         20
Predictive margins                                Number of obs   =        154
                                                  Average RVI     =     0.0419
                                                  Largest FMI     =     0.1500
DF adjustment:   Large sample                     DF:     min     =     866.55
                                                          avg     =   41276.27
Within VCE type: Delta-method                             max     =  191492.08

Expression   : Pr(attack), predict(pr)

1._at        : bmi             =          20

2._at        : bmi             =          25

3._at        : bmi             =          30

4._at        : bmi             =          35

------------------------------------------------------------------------------
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  _at#smokes |
        1 0  |   .2132581   .0594192     3.59   0.000     .0967207    .3297954
        1 1  |   .4716674   .0832583     5.67   0.000     .3084512    .6348835
        2 0  |   .3239331   .0498932     6.49   0.000     .2261437    .4217226
        2 1  |   .6118189   .0617438     9.91   0.000     .4908022    .7328357
        3 0  |   .4585168   .0750399     6.11   0.000     .3113833    .6056503
        3 1  |   .7356222    .071637    10.27   0.000     .5951187    .8761257
        4 0  |   .5985056   .1179053     5.08   0.000     .3671509    .8298602
        4 1  |    .830356   .0798759    10.40   0.000      .673583    .9871289
------------------------------------------------------------------------------

Now, if we were to call marginsplot, the plotted confidence intervals would not match what mimrgns reports. How large would the differences be? To find out, we first store mimrgns' results in a matrix. More precisely, we store the coefficients, standard errors, and confidence limits. Those results are stored in rows 1, 2, 5, and 6 of r(table). Since we will later store these results as variables, we transpose the matrix.

Code:

matrix rtable = r(table)
matrix rtable  = (rtable[1..2, 1...]\ rtable[5..6, 1...])'

Now, that we have stored mimrgns' results, we will replicate the file that marginsplot would use. To do so, we need to dig a little deeper into marginsplot's internals. marginsplot internally calls margins with its undocumented saving() option. Undocumented means that there is an online help file but no corresponding entry in the manual. More importantly, it means that this option might behave differently in the future, even under version control. Anyway, margins, with the saving() option, calls a utility routine, _marg_save, which is not documented. Not documented means that there is neither an online help file nor an entry in the manual. The code is implemented as an ado-file, so we can learn how it works. Obviously, it might work differently in the future.

Code:

_marg_save , saving(mimrgns_results , double)

We can now load the created file and have a look at some of the variables

Code:

use mimrgns_results, clear
list _margin _se _ci_lb _ci_ub , noobs separator(0)

The output is

Code:

. list _margin _se _ci_lb _ci_ub , noobs separator(0)

  +-----------------------------------------------+
  |   _margin         _se      _ci_lb      _ci_ub |
  |-----------------------------------------------|
  | .21325808   .05941922   .09679856    .3297176 |
  | .47166739   .08325828   .30848416   .63485062 |
  | .32393311   .04989317   .22614429   .42172193 |
  | .61181892   .06174378   .49080334    .7328345 |
  | .45851679   .07503986   .31144138   .60559221 |
  | .73562219   .07163704   .59521618    .8760282 |
  | .59850555   .11790533   .36741535   .82959576 |
  | .83035595   .07987593   .67380201   .98690989 |
  +-----------------------------------------------+

From a first glance, it appears as if the results match those from mimrgns. To be sure, we will save the mimrgns results in the same dataset and calculate the respective differences.

Code:

svmat double rtable , names(col)
local vars _margin b _se se _ci_lb ll _ci_ub ul
forvalues i = 1(2)8 {
    local var1 : word   `i' of `vars'
    local var2 : word `++i' of `vars'
    generate double diff_`var2' = abs(`var1'-`var2')
}

Finally, we can look at the differences.

Code:

list _margin b diff_b _se se diff_se _ci_lb ll diff_ll _ci_ub ul diff_ul , noobs separator(0)

which shows

Code:

. list _margin b diff_b _se se diff_se _ci_lb ll diff_ll _ci_ub ul diff_ul , noobs separator(0)

  +------------------------------------------------------------------------------------------------------------------------------------------+
  |   _margin           b   diff_b         _se          se   diff_se      _ci_lb          ll     diff_ll      _ci_ub          ul     diff_ul |
  |------------------------------------------------------------------------------------------------------------------------------------------|
  | .21325808   .21325808        0   .05941922   .05941922         0   .09679856   .09672075   .00007781    .3297176   .32979541   .00007781 |
  | .47166739   .47166739        0   .08325828   .08325828         0   .30848416   .30845124   .00003292   .63485062   .63488354   .00003292 |
  | .32393311   .32393311        0   .04989317   .04989317         0   .22614429   .22614367   6.181e-07   .42172193   .42172255   6.181e-07 |
  | .61181892   .61181892        0   .06174378   .06174378         0   .49080334   .49080216   1.180e-06    .7328345   .73283568   1.180e-06 |
  | .45851679   .45851679        0   .07503986   .07503986         0   .31144138    .3113833   .00005808   .60559221   .60565029   .00005808 |
  | .73562219   .73562219        0   .07163704   .07163704         0   .59521618   .59511871   .00009746    .8760282   .87612566   .00009746 |
  | .59850555   .59850555        0   .11790533   .11790533         0   .36741535   .36715088   .00026447   .82959576   .82986022   .00026447 |
  | .83035595   .83035595        0   .07987593   .07987593         0   .67380201   .67358305   .00021897   .98690989   .98712886   .00021897 |
  +------------------------------------------------------------------------------------------------------------------------------------------+

We find that the point estimates and the standard errors match exactly; the difference is flat 0. As stated in the help file, the confidence intervals do not match exactly. In this example, we find that the differences are in the 4th decimal place or later. We will not be able to spot those differences in a graph.

Note that there is no guarantee that the differences will always be that small. However, perhaps the waring in the help file overstates the problem. Anyway, you now know how to check when in doubt.

Best
Daniel

* I have modified the mimrgns call to include imputed variables in the at() option. The observed differences for confidence intervals are even smaller (virtually 0) in the original example.

Last edited by daniel klein; 30 Jan 2019, 14:03.

Tags: margins, marginsplot, mimrgns, multiple imputation

Announcement

-mimrgns- and -marginsplot-