Storing confidence intervals

Jad Tamimi

Join Date: Sep 2023

Posts: 115
#1

29 Sep 2023, 13:17

Hello, I'm running a regression and would like to store the lower and upper bounds of the confidence interval in two variables (one for each bound). I know that the coefficient can be stored with, for example, gen coef = _b[varname]. What is the analogous code for storing the lower and upper bounds?
Clyde Schechter

Join Date: Apr 2014

Posts: 29961
#2

29 Sep 2023, 13:28

There is nothing exactly analogous to _b[varname] for the confidence bounds. However, you can get them as follows:

Code:

estimation_command dv ivs, options
matrix M = r(table)
gen lb_varname = M["ll", "varname"]
gen ub_varname = M["ul", "varname"]

Note: since the regression results, both coefficients and confidence limits, are attributes of the entire estimation sample, it may not make sense to store those results in new variables, where the same number is repeated in all observations of the data set. It is often more appropriate to store them as scalars or local macros. (If, however, you are repeating the regression on various subsets of the data, it may make sense to store the results in a variable, with the results for each subset in the observations corresponding to the subset.)
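The parenthetical case above, repeating a regression on subsets and storing each result in the observations of the matching subset, might look like the following minimal sketch. It assumes the auto data set and a plain regress call rather than any particular model; the variable names lb and ub and the use of rep78 as the grouping variable are illustrative only. It locates the "ll" and "ul" rows of r(table) with rownumb() and colnumb().

```stata
* Sketch: leave-one-group-out regressions on the auto data (illustrative only).
sysuse auto, clear
keep if !missing(rep78)

* Hypothetical result variables, filled in group by group.
generate double lb = .
generate double ub = .

levelsof rep78, local(groups)
foreach g of local groups {
    * Fit the model with group `g' left out.
    quietly regress price mpg if rep78 != `g'
    * r(table) holds the reported results; rows "ll" and "ul" are the CI limits.
    matrix M = r(table)
    * Write the bounds into the observations of the omitted group.
    quietly replace lb = M[rownumb(M, "ll"), colnumb(M, "mpg")] if rep78 == `g'
    quietly replace ub = M[rownumb(M, "ul"), colnumb(M, "mpg")] if rep78 == `g'
}

* Each group now carries the bounds from the regression that excluded it.
tabstat lb ub, by(rep78)
```

With a single regression on the full sample, scalars or local macros would usually be the better container, as noted above.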
Jad Tamimi

Join Date: Sep 2023

Posts: 115
#3

29 Sep 2023, 13:47

Hello Clyde, thank you for replying. Here is the code I'm working with:

foreach x in "7756" "7783" "10540" "10660" "10993" "10936" "12058" "13321" "13333" "13336" "17851" "17863" "17866" "17869" "20311" "23596" "25432" "26374" "26383" "26398" "26473" "26494" "26503" "27109" "27112" {
reghdfe empend pre3_2003 pre2_2003 post0_2003-post9_2003 if wardnew != `x', absorb(wardnew year) cluster(wardnew)
replace coef = _b[post4_2003] if coef==`x'
}

Each 4- or 5-digit code here represents a regression in which the observations with that code were dropped; the coef variable collects the coefficient from each regression. I'm trying to do something similar for the upper and lower bounds of each regression. I initially generated the coef variable equal to the wardnew variable (which contains the 4- or 5-digit codes), and I generated the upper and lower bound variables equal to coef, so they also start out containing the codes. How would you suggest I store the upper and lower bounds in the same way?
Jad Tamimi

Join Date: Sep 2023

Posts: 115
#4

29 Sep 2023, 16:32

I solved this problem using a slightly different method (by storing the standard errors instead and manually computing the upper and lower bounds). Thanks for your help!
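For readers wanting to do the same, here is a minimal sketch of computing the bounds by hand from the stored standard error. It is not the poster's actual code: it assumes the auto data set, a plain regress call, a 95% level, and that the command stores its residual degrees of freedom in e(df_r). For commands that report z statistics instead of t statistics, invnormal(0.975) would replace the invttail() call. The scalar names are illustrative.

```stata
sysuse auto, clear
regress price mpg

* Critical value for a 95% interval, using the residual degrees of freedom.
scalar crit = invttail(e(df_r), 0.025)

* Manual bounds: coefficient plus or minus critical value times standard error.
scalar ci_lb = _b[mpg] - crit * _se[mpg]
scalar ci_ub = _b[mpg] + crit * _se[mpg]
display "95% CI for mpg: " ci_lb " to " ci_ub

* Cross-check against the limits reported in r(table) after a replay.
regress
matrix M = r(table)
display "r(table) limits: " M[rownumb(M, "ll"), colnumb(M, "mpg")] ///
    " to " M[rownumb(M, "ul"), colnumb(M, "mpg")]
```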
Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 686
#5

30 Sep 2023, 10:25

Starting in Stata 17 (but not documented until Stata 18), you can get the lower and upper bounds of each coefficient's CI via system variables _r_lb and _r_ub. In Stata 18, see help _variables.
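As a minimal sketch of how these system variables could answer the original question, assuming Stata 17 or newer, the auto data set, and illustrative variable names lb_mpg and ub_mpg:

```stata
* Requires Stata 17 or newer for _r_lb and _r_ub.
sysuse auto, clear
regress price mpg

* Hypothetical variable names; the same value is repeated in every observation.
generate double lb_mpg = _r_lb[mpg]
generate double ub_mpg = _r_ub[mpg]
summarize lb_mpg ub_mpg
```

The subscript is the coefficient name, just as with _b[] and _se[].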
Clyde Schechter

Join Date: Apr 2014

Posts: 29961
#6

30 Sep 2023, 10:50

Jeff Pitblado (StataCorp) That's great! I wasn't aware of it.

Dirk Enzmann

Join Date: Apr 2014
Posts: 523

02 Oct 2023, 10:54

I wasn't aware either of what Jeff Pitblado (StataCorp) showed us in #5.

However, uninitiated users should note that _r_lb and _r_ub use r-returns (presumably from r(table)), not e-returns. This may be relevant, for example, if you want to obtain the CI of odds ratios after running a logit regression:

Code:

. sysuse auto
. logit foreign price

Iteration 0: Log likelihood = -45.03321
Iteration 1: Log likelihood = -44.947363
Iteration 2: Log likelihood = -44.94724
Iteration 3: Log likelihood = -44.94724

Logistic regression                                     Number of obs =     74
                                                        LR chi2(1)    =   0.17
                                                        Prob > chi2   = 0.6784
Log likelihood = -44.94724                              Pseudo R2     = 0.0019

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |   .0000353   .0000844     0.42   0.676    -.0001301    .0002006
       _cons |  -1.079792   .5878344    -1.84   0.066    -2.231927    .0723419
------------------------------------------------------------------------------

.
. di "coeff: " _b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons] _n ///
> " OR: " exp(_b[_cons]) " CI: " exp(_r_lb[_cons]) " " exp(_r_ub[_cons])
coeff: -1.0797924 CI: -2.2319267 .07234185
   OR: .33966602  CI: .10732145 1.0750228

.
. logit, or

Logistic regression                                     Number of obs =     74
                                                        LR chi2(1)    =   0.17
                                                        Prob > chi2   = 0.6784
Log likelihood = -44.94724                              Pseudo R2     = 0.0019

------------------------------------------------------------------------------
     foreign | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |   1.000035   .0000844     0.42   0.676     .9998699    1.000201
       _cons |    .339666   .1996674    -1.84   0.066     .1073214    1.075023
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

.
. di "coeff: " _b[_cons] " CI: " ln(_r_lb[_cons]) " " ln(_r_ub[_cons]) _n ///
> " OR: " exp(_b[_cons]) " CI: " _r_lb[_cons] " " _r_ub[_cons]
coeff: -1.0797924 CI: -2.2319267 .07234185
   OR: .33966602  CI: .10732145 1.0750228

Last edited by Dirk Enzmann; 02 Oct 2023, 11:00.


Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 686

02 Oct 2023, 11:14

These system variables give you access to the reported values for the
current estimation results, and they are updated any time you replay with a
different level() or an or-like option. The values they
access are updated on replay but stay with the estimation results. Also, the
system variable _r_b will show the odds ratios (and odds for
_cons) if you requested them via option or at estimation
or in a replay afterward.

Adding on to Dirk's example, I add a call to summarize to change the
contents of r(), but the system variables continue to work.

Code:

. sysuse auto
(1978 automobile data)

. 
. logit foreign price

Iteration 0:   log likelihood =  -45.03321  
Iteration 1:   log likelihood = -44.947363  
Iteration 2:   log likelihood =  -44.94724  
Iteration 3:   log likelihood =  -44.94724  

Logistic regression                                     Number of obs =     74
                                                        LR chi2(1)    =   0.17
                                                        Prob > chi2   = 0.6784
Log likelihood = -44.94724                              Pseudo R2     = 0.0019

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |   .0000353   .0000844     0.42   0.676    -.0001301    .0002006
       _cons |  -1.079792   .5878344    -1.84   0.066    -2.231927    .0723419
------------------------------------------------------------------------------

. 
. di "coeff: " _b[_cons] _n ///
> "coeff: " _r_b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons] ///
> 
coeff: -1.0797924
coeff: -1.0797924 CI: -2.2319267 .07234185

. logit, or

Logistic regression                                     Number of obs =     74
                                                        LR chi2(1)    =   0.17
                                                        Prob > chi2   = 0.6784
Log likelihood = -44.94724                              Pseudo R2     = 0.0019

------------------------------------------------------------------------------
     foreign | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |   1.000035   .0000844     0.42   0.676     .9998699    1.000201
       _cons |    .339666   .1996674    -1.84   0.066     .1073214    1.075023
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

. 
. di "coeff: " _b[_cons] _n ///
> "coeff: " ln(_r_b[_cons]) " CI: " ln(_r_lb[_cons]) " " ln(_r_ub[_cons]) _n ///
> " odds: " _r_b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons]
coeff: -1.0797924
coeff: -1.0797924 CI: -2.2319267 .07234185
 odds: .33966602 CI: .10732145 1.0750228

. 
. summarize mpg

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41

. return list

scalars:
                  r(N) =  74
              r(sum_w) =  74
               r(mean) =  21.2972972972973
                r(Var) =  33.47204738985561
                 r(sd) =  5.785503209735141
                r(min) =  12
                r(max) =  41
                r(sum) =  1576

. 
. di "coeff: " _b[_cons] _n ///
> "coeff: " ln(_r_b[_cons]) " CI: " ln(_r_lb[_cons]) " " ln(_r_ub[_cons]) _n ///
> " odds: " _r_b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons]
coeff: -1.0797924
coeff: -1.0797924 CI: -2.2319267 .07234185
 odds: .33966602 CI: .10732145 1.0750228
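The point about level() can be sketched the same way. This is a minimal illustration assuming Stata 17 or newer, the auto data set, and a plain regress call; the scalar names lb95 and lb90 are hypothetical. Because a 90% interval is narrower than a 95% interval, the lower bound read after the replay should be the larger of the two.

```stata
* Requires Stata 17 or newer for _r_lb.
sysuse auto, clear
regress price mpg

* Lower bound at the default 95% level.
scalar lb95 = _r_lb[mpg]

* Replay the same results with a 90% interval; nothing is re-estimated.
regress, level(90)
scalar lb90 = _r_lb[mpg]

display "95% lower bound: " lb95 _n "90% lower bound: " lb90
```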
