"inverting" a relative risk using -margins-

Jeph Herrin


30 May 2018, 08:21

I am estimating a set of log-binomial models using -glm-; from these, I would like to report relative risks (RRs). In general this is trivial, because RR = exp(coeff). However, for a couple of the models, the only way to get convergence is to model the complement of the outcome, that is, Z = 1-Y in place of Y. This means that those RRs have the opposite interpretation, which is a nuisance for reporting them all in a single table. So I would like to 'invert' the RRs for 1-Y to get the RRs for Y.

If there are no covariates, I can do this quite simply with -nlcom-

Code:

glm z i.x1, family(binomial) link(log)
nlcom (1-exp(_b[1.x1]+_b[_cons]))/(1-exp(_b[_cons]))

but naturally I have additional covariates and this trick doesn't work. I thought I could do something like this using -margins- , but I can't seem to get -margins- to give me the appropriate probabilities. That is,

margins x1, atmeans

reports what I thought were Pr(Z|x1=1) and Pr(Z|x1=0), from which I could get Pr(Y|x1=1) = 1-Pr(Z|x1=1), and likewise for x1=0, and so construct the RR for the effect of x1 on Y. But the result I get does not agree with exp(coeff) from the Y model (in a case where I can fit both the Y and the Z model). I don't typically work with log-binomial models or relative risks, so I feel like I'm missing something obvious.

thanks,
Jeph

Jeph Herrin

30 May 2018, 08:48

Here's a more concrete example:

Code:

use http://www.stata-press.com/data/r15/lbw, clear
gen byte high=1-low 
glm low i.smoke ht, family(binomial) link(log) eform 
glm high i.smoke ht, family(binomial) link(log) eform

margins i.smoke, atmeans
di (1-.5903365)/(1-.7425416)

I would expect the last calculation to give me the RR for smoke from the first model, but the two differ: the calculation gives 1.59, whereas the model gives 1.56.
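The two margins can also be read from r(b) instead of being retyped by hand. This is only a sketch of the same calculation, assuming the lbw data and the model for high shown above; the matrix name m is an arbitrary choice. It reproduces the discrepancy and does not fix it.

```stata
use http://www.stata-press.com/data/r15/lbw, clear
gen byte high = 1 - low
glm high i.smoke ht, family(binomial) link(log)

* Predicted Pr(high) for smoke = 0 and smoke = 1, with ht held at its mean
margins i.smoke, atmeans

* r(b) holds the two margins: column 1 is 0.smoke, column 2 is 1.smoke
matrix m = r(b)

* Ratio of the complements, i.e. the implied Pr(low) for smokers vs. nonsmokers
display (1 - m[1,2]) / (1 - m[1,1])
```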


Joerg Luedicke (StataCorp)

30 May 2018, 09:18

Hi Jeph,

You could do this using the expression() option of margins. Here is an example:

Code:

. * Toy data:
. clear
. sysuse auto
(1978 Automobile Data)
. rename foreign y
. gen z  = 1-y
. gen x1 = weight > 3000
. 
. * Model:
. glm z i.x1, family(binomial) link(log)

Iteration 0:   log likelihood = -69.809494  (not concave)
Iteration 1:   log likelihood =  -38.62661  
Iteration 2:   log likelihood = -33.803268  
Iteration 3:   log likelihood = -31.838851  
Iteration 4:   log likelihood = -31.790708  
Iteration 5:   log likelihood = -31.790431  
Iteration 6:   log likelihood = -31.790431  

Generalized linear models                         No. of obs      =         74
Optimization     : ML                             Residual df     =         72
                                                  Scale parameter =          1
Deviance         =  63.58086147                   (1/df) Deviance =   .8830675
Pearson          =           74                   (1/df) Pearson  =   1.027778

Variance function: V(u) = u*(1-u)                 [Bernoulli]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =   .9132549
Log likelihood   = -31.79043073                   BIC             =  -246.3118

------------------------------------------------------------------------------
             |                 OIM
           z |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        1.x1 |   .7946541   .1986989     4.00   0.000     .4052115    1.184097
       _cons |  -.8472979     .19518    -4.34   0.000    -1.229844   -.4647521
------------------------------------------------------------------------------
. 
. * Estimating risks using -margins-:
. margins, expression(1-exp(predict(xb))) at(x1 = (0 1)) post

Adjusted predictions                            Number of obs     =         74
Model VCE    : OIM

Expression   : 1-exp(predict(xb))

1._at        : x1              =           0

2._at        : x1              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .5714286   .0836486     6.83   0.000     .4074804    .7353768
          2  |   .0512821   .0353199     1.45   0.147    -.0179436    .1205077
------------------------------------------------------------------------------
. 
. * Risk ratio:
. nlcom _b[2._at]/_b[1bn._at]

       _nl_1:  _b[2._at]/_b[1bn._at]

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   .0897436   .0631904     1.42   0.156    -.0341074    .2135945
------------------------------------------------------------------------------

The above yields the same result that we would get using the original variable and default predictions:

Code:

. * Results with original variable:
. glm y i.x1, family(binomial) link(log)

Iteration 0:   log likelihood = -44.832193  
Iteration 1:   log likelihood = -32.320585  
Iteration 2:   log likelihood = -31.819242  
Iteration 3:   log likelihood = -31.790511  
Iteration 4:   log likelihood = -31.790431  
Iteration 5:   log likelihood = -31.790431  

Generalized linear models                         No. of obs      =         74
Optimization     : ML                             Residual df     =         72
                                                  Scale parameter =          1
Deviance         =  63.58086147                   (1/df) Deviance =   .8830675
Pearson          =  73.99999998                   (1/df) Pearson  =   1.027778

Variance function: V(u) = u*(1-u)                 [Bernoulli]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =   .9132549
Log likelihood   = -31.79043073                   BIC             =  -246.3118

------------------------------------------------------------------------------
             |                 OIM
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        1.x1 |  -2.410799   .7041218    -3.42   0.001    -3.790852   -1.030745
       _cons |  -.5596158    .146385    -3.82   0.000    -.8465251   -.2727064
------------------------------------------------------------------------------

. margins, at(x1 = (0 1)) post

Adjusted predictions                            Number of obs     =         74
Model VCE    : OIM

Expression   : Predicted mean y, predict()

1._at        : x1              =           0

2._at        : x1              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .5714286   .0836486     6.83   0.000     .4074804    .7353768
          2  |   .0512821   .0353199     1.45   0.147    -.0179436    .1205077
------------------------------------------------------------------------------

. nlcom _b[2._at]/_b[1bn._at]

       _nl_1:  _b[2._at]/_b[1bn._at]

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   .0897436   .0631904     1.42   0.156    -.0341074    .2135945
------------------------------------------------------------------------------

I hope this helps,

Joerg


Jeph Herrin

30 May 2018, 10:15

Joerg,

Thanks for this. However, this doesn't seem to work if there are any covariates.

Code:

. 
. sysuse auto, clear
(1978 Automobile Data)

. rename foreign y

. gen z  = 1-y

. gen x1 = weight > 3000

. gen x2= headroom > 3

. 
. glm y i.x1 i.x2, family(binomial) link(log) eform nolog

Generalized linear models                         No. of obs      =         74
Optimization     : ML                             Residual df     =         71
                                                  Scale parameter =          1
Deviance         =  62.50805905                   (1/df) Deviance =   .8803952
Pearson          =  73.08224164                   (1/df) Pearson  =   1.029327

Variance function: V(u) = u*(1-u)                 [Bernoulli]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =   .9257846
Log likelihood   = -31.25402952                   BIC             =  -243.0806

------------------------------------------------------------------------------
             |                 OIM
           y | Risk Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        1.x1 |   .1285269   .1023399    -2.58   0.010     .0269912    .6120199
        1.x2 |   .5259416   .3902062    -0.87   0.386     .1228612     2.25144
       _cons |   .5948809   .0863165    -3.58   0.000     .4476326    .7905664
------------------------------------------------------------------------------
Note: _cons estimates baseline risk.

. margins, at(x1 = (0 1)) post

Predictive margins                              Number of obs     =         74
Model VCE    : OIM

Expression   : Predicted mean y, predict()

1._at        : x1              =           0

2._at        : x1              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .4805532   .1090046     4.41   0.000     .2669081    .6941984
          2  |    .061764   .0441922     1.40   0.162     -.024851    .1483791
------------------------------------------------------------------------------

. nlcom _b[2._at]/_b[1bn._at]

       _nl_1:  _b[2._at]/_b[1bn._at]

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   .1285269   .1023399     1.26   0.209    -.0720556    .3291095
------------------------------------------------------------------------------

. 
. 
. glm z i.x1 i.x2, family(binomial) link(log) eform nolog

Generalized linear models                         No. of obs      =         74
Optimization     : ML                             Residual df     =         71
                                                  Scale parameter =          1
Deviance         =  63.10858038                   (1/df) Deviance =   .8888532
Pearson          =  73.91466324                   (1/df) Pearson  =   1.041052

Variance function: V(u) = u*(1-u)                 [Bernoulli]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =   .9338997
Log likelihood   = -31.55429019                   BIC             =    -242.48

------------------------------------------------------------------------------
             |                 OIM
           z | Risk Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        1.x1 |   2.124844   .4572846     3.50   0.000      1.39361    3.239761
        1.x2 |   1.062606   .1090718     0.59   0.554     .8689604    1.299404
       _cons |   .4271722   .0834032    -4.36   0.000     .2913468    .6263192
------------------------------------------------------------------------------
Note: _cons estimates baseline risk.

. margins, expression(1-exp(predict(xb))) at(x1 = (0 1)) post

Predictive margins                              Number of obs     =         74
Model VCE    : OIM

Expression   : 1-exp(predict(xb))

1._at        : x1              =           0

2._at        : x1              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .5619859   .0870172     6.46   0.000     .3914353    .7325365
          2  |   .0692884   .0539057     1.29   0.199    -.0363648    .1749416
------------------------------------------------------------------------------

. nlcom _b[2._at]/_b[1bn._at]

       _nl_1:  _b[2._at]/_b[1bn._at]

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |    .123292   .1006031     1.23   0.220    -.0738865    .3204706
------------------------------------------------------------------------------

The two ratios no longer agree (.1285 from the y model versus .1233 from the z model), which is what I ran into initially. Is there a fix?

thanks,
Jeph


Joerg Luedicke (StataCorp)

30 May 2018, 13:21

Sorry, I was a bit too fast here. This will only work when the model contains a single categorical predictor variable. In all other cases, the exponentiated linear predictions from a model with the original outcome variable and those from a model with the complement of the outcome do not sum to unity, as the inverse logits would if this were a logit model. Unlike the inverse logit function, the exponential function is not symmetric. If you are fitting log-binomial models solely for the purpose of estimating risk ratios instead of odds ratios, then my personal suggestion would be to just fit a logit or probit model instead and compute risk ratios using margins.
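
A minimal sketch of that logit-plus-margins route, assuming the lbw data and the covariates (smoke, ht) from the example earlier in the thread; the label rr is an arbitrary name used for illustration. The risks are predictive margins averaged over the sample, so the result is a marginal risk ratio, not a constant conditional one.

```stata
* Fit a logit model for the original outcome
use http://www.stata-press.com/data/r15/lbw, clear
logit low i.smoke ht

* Average predicted risk with everyone set to smoke = 0, then to smoke = 1;
* post the two margins so that nlcom can use them
margins smoke, post

* Risk ratio (smokers vs. nonsmokers) with a delta-method standard error
nlcom (rr: _b[1.smoke] / _b[0bn.smoke])
```

Because the inverse logit is symmetric, modeling 1-low instead would give margins equal to one minus these, so no inversion step is needed with this approach.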
Rich Goldstein

30 May 2018, 13:30

Another possibility would be to use -poisson- (with robust SEs); this has been discussed on this forum previously. Do note that -poisson- is inefficient for this purpose, but if your N is sizable that won't matter.
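
A short sketch of the Poisson approach, again assuming the lbw example data and covariates used earlier in the thread, not the poster's actual models. With a binary outcome, the exponentiated coefficients are read as risk ratios, and the robust variance estimator accounts for the outcome not being Poisson distributed.

```stata
use http://www.stata-press.com/data/r15/lbw, clear

* Poisson regression on the binary outcome with robust (sandwich) standard errors;
* irr displays exponentiated coefficients, interpreted here as risk ratios
poisson low i.smoke ht, vce(robust) irr
```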
Jeph Herrin

30 May 2018, 17:48

Thanks for the responses. We were using logit and a reviewer asked us to use a more appropriate model; we had trouble with the Poisson + robust SEs converging as well, so thought we'd try to do everything in log-binomial. But based on this thread, we're going to use logit and use margins to get RRs.