
  • Storing confidence intervals

    Hello, I'm running a regression and would like to store the lower and upper bounds of the confidence interval in two variables (one for each bound). I know that the command for storing the coefficient is, for example, gen coef = _b[varname]. What is the analogous code for storing the lower and upper bounds?

  • #2
    There is nothing exactly analogous to _b[varname] for the confidence bounds. However, you can get them as follows:

    Code:
    estimation_command dv ivs, options
    matrix M = r(table)
    gen lb_varname = M["ll", "varname"]
    gen ub_varname = M["ul", "varname"]
    Note: since the regression results, both coefficients and confidence limits, are attributes of the entire estimation sample, it may not make sense to store those results in new variables, where the same number is repeated in all observations of the data set. It is often more appropriate to store them as scalars or local macros. (If, however, you are repeating the regression on various subsets of the data, it may make sense to store the results in a variable, with the results for each subset in the observations corresponding to the subset.)
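
    A minimal sketch of the scalar approach, assuming the auto dataset and an illustrative regress model (neither comes from the question). It looks up the row and column positions with rownumb() and colnumb() instead of subscripting by name; the scalar names lb_mpg and ub_mpg are arbitrary:

    ```stata
    sysuse auto, clear
    regress price mpg weight

    * r(table) holds the displayed results; its rows include b, se, ll, and ul
    matrix M = r(table)

    * Store the bounds for mpg as scalars rather than as constant variables
    scalar lb_mpg = M[rownumb(M, "ll"), colnumb(M, "mpg")]
    scalar ub_mpg = M[rownumb(M, "ul"), colnumb(M, "mpg")]
    display "95% CI for mpg: " lb_mpg " to " ub_mpg
    ```

    Scalars keep one copy of each number, which avoids filling a whole column of the dataset with a repeated constant.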



    • #3
      Hello Clyde, thank you for replying. Here is the code I'm working with:


      foreach x in "7756" "7783" "10540" "10660" "10993" "10936" "12058" "13321" "13333" "13336" "17851" "17863" "17866" "17869" "20311" "23596" "25432" "26374" "26383" "26398" "26473" "26494" "26503" "27109" "27112" {
      reghdfe empend pre3_2003 pre2_2003 post0_2003-post9_2003 if wardnew != `x', absorb(wardnew year) cluster(wardnew)
      replace coef = _b[post4_2003] if coef==`x'
      }

      Each 4- or 5-digit code here represents a regression in which the observation containing that code was dropped; the coef variable will hold the coefficient from each regression. I'm trying to do something similar for the upper and lower bounds for each regression. I initially generated the coef variable to be equal to the wardnew variable (which contains the 4- or 5-digit codes). I have generated the upper and lower bound variables so that they are equal to the coef variable, which contains the 4- or 5-digit codes. How would you suggest I implement the upper and lower bounds similarly?



      • #4
        I solved this problem using a slightly different method (by storing the standard errors instead and manually computing the upper and lower bounds). Thanks for your help!
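
        The poster's exact code is not shown, so the following is only a sketch of what a manual computation could look like. It assumes the auto dataset and a regress model with clustered standard errors as stand-ins for the reghdfe setup, and it assumes the estimator reports t-based intervals with e(df_r) degrees of freedom, as regress does:

        ```stata
        sysuse auto, clear
        regress price mpg weight, vce(cluster rep78)

        * Keep the reported table so the manual bounds can be compared with it
        matrix M = r(table)

        * Two-sided 95% critical value from the t distribution
        scalar tcrit = invttail(e(df_r), 0.025)

        * Bounds are coefficient -/+ critical value times standard error
        scalar lb_mpg = _b[mpg] - tcrit * _se[mpg]
        scalar ub_mpg = _b[mpg] + tcrit * _se[mpg]

        display "manual:   " lb_mpg " " ub_mpg
        display "reported: " M[rownumb(M, "ll"), colnumb(M, "mpg")] " " M[rownumb(M, "ul"), colnumb(M, "mpg")]
        ```

        For estimators that report z-based intervals, invnormal(0.975) would take the place of the invttail() call.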



        • #5
          Starting in Stata 17 (but not documented until Stata 18), you can get the lower and upper bounds of each coefficient's CI via system variables _r_lb and _r_ub. In Stata 18, see help _variables.
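
          As a hedged illustration of how these system variables could serve the leave-one-group-out loop from #3, here is a sketch that assumes Stata 17 or later and uses the auto dataset, with rep78 standing in for wardnew and regress standing in for reghdfe. The variable names coef, lb, and ub are illustrative:

          ```stata
          sysuse auto, clear

          * One result variable per quantity, filled in group by group
          generate double coef = .
          generate double lb   = .
          generate double ub   = .

          levelsof rep78, local(groups)
          foreach g of local groups {
              * Fit the model with group `g' left out
              regress price mpg weight if rep78 != `g'
              * Store the results in the observations of the omitted group
              quietly replace coef = _b[mpg]    if rep78 == `g'
              quietly replace lb   = _r_lb[mpg] if rep78 == `g'
              quietly replace ub   = _r_ub[mpg] if rep78 == `g'
          }

          * Observations with missing rep78 keep missing results
          tabstat coef lb ub, by(rep78)
          ```

          Each observation then carries the estimate and bounds from the regression that excluded its own group, which matches the layout described in #3.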



          • #6
            Jeff Pitblado (StataCorp) That's great! I wasn't aware of it.



            • #7
              I wasn't aware of what Jeff Pitblado (StataCorp) showed us in #5 either.

              However, uninitiated users should note that _r_lb and _r_ub use r-returns (presumably from r(table)), not e-returns. This may be relevant, for example, if you want to obtain the CI of odds ratios after running a logit regression:
              Code:
              . sysuse auto
              . logit foreign price
              
              Iteration 0: Log likelihood = -45.03321
              Iteration 1: Log likelihood = -44.947363
              Iteration 2: Log likelihood = -44.94724
              Iteration 3: Log likelihood = -44.94724
              
              Logistic regression Number of obs = 74
              LR chi2(1) = 0.17
              Prob > chi2 = 0.6784
              Log likelihood = -44.94724 Pseudo R2 = 0.0019
              
              ------------------------------------------------------------------------------
              foreign | Coefficient Std. err. z P>|z| [95% conf. interval]
              -------------+----------------------------------------------------------------
              price | .0000353 .0000844 0.42 0.676 -.0001301 .0002006
              _cons | -1.079792 .5878344 -1.84 0.066 -2.231927 .0723419
              ------------------------------------------------------------------------------
              
              .
              . di "coeff: " _b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons] _n ///
              > " OR: " exp(_b[_cons]) " CI: " exp(_r_lb[_cons]) " " exp(_r_ub[_cons])
              coeff: -1.0797924 CI: -2.2319267 .07234185
                 OR: .33966602  CI: .10732145 1.0750228
              
              .
              . logit, or
              
              Logistic regression Number of obs = 74
              LR chi2(1) = 0.17
              Prob > chi2 = 0.6784
              Log likelihood = -44.94724 Pseudo R2 = 0.0019
              
              ------------------------------------------------------------------------------
              foreign | Odds ratio Std. err. z P>|z| [95% conf. interval]
              -------------+----------------------------------------------------------------
              price | 1.000035 .0000844 0.42 0.676 .9998699 1.000201
              _cons | .339666 .1996674 -1.84 0.066 .1073214 1.075023
              ------------------------------------------------------------------------------
              Note: _cons estimates baseline odds.
              
              .
              . di "coeff: " _b[_cons] " CI: " ln(_r_lb[_cons]) " " ln(_r_ub[_cons]) _n ///
              > " OR: " exp(_b[_cons]) " CI: " _r_lb[_cons] " " _r_ub[_cons]
              coeff: -1.0797924 CI: -2.2319267 .07234185
                 OR: .33966602  CI: .10732145 1.0750228
              Last edited by Dirk Enzmann; 02 Oct 2023, 11:00.



              • #8
                These system variables allow you to access the reported values for the
                current estimation results, and they are updated any time you replay with a
                different level() or an or-like option. The values they
                access are updated but stay with the estimation results. Also, the
                system variable _r_b will show the odds ratios (and odds for
                _cons) if you requested them via option or at estimation
                or in a replay afterward.

                Adding on to Dirk's example, I add a call to summarize to change the
                contents of r(), but the system variables continue to work.
                Code:
                . sysuse auto
                (1978 automobile data)
                
                . 
                . logit foreign price
                
                Iteration 0:   log likelihood =  -45.03321  
                Iteration 1:   log likelihood = -44.947363  
                Iteration 2:   log likelihood =  -44.94724  
                Iteration 3:   log likelihood =  -44.94724  
                
                Logistic regression                                     Number of obs =     74
                                                                        LR chi2(1)    =   0.17
                                                                        Prob > chi2   = 0.6784
                Log likelihood = -44.94724                              Pseudo R2     = 0.0019
                
                ------------------------------------------------------------------------------
                     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                       price |   .0000353   .0000844     0.42   0.676    -.0001301    .0002006
                       _cons |  -1.079792   .5878344    -1.84   0.066    -2.231927    .0723419
                ------------------------------------------------------------------------------
                
                . 
                . di "coeff: " _b[_cons] _n ///
                > "coeff: " _r_b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons] ///
                > 
                coeff: -1.0797924
                coeff: -1.0797924 CI: -2.2319267 .07234185
                
                . logit, or
                
                Logistic regression                                     Number of obs =     74
                                                                        LR chi2(1)    =   0.17
                                                                        Prob > chi2   = 0.6784
                Log likelihood = -44.94724                              Pseudo R2     = 0.0019
                
                ------------------------------------------------------------------------------
                     foreign | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                       price |   1.000035   .0000844     0.42   0.676     .9998699    1.000201
                       _cons |    .339666   .1996674    -1.84   0.066     .1073214    1.075023
                ------------------------------------------------------------------------------
                Note: _cons estimates baseline odds.
                
                . 
                . di "coeff: " _b[_cons] _n ///
                > "coeff: " ln(_r_b[_cons]) " CI: " ln(_r_lb[_cons]) " " ln(_r_ub[_cons]) _n ///
                > " odds: " _r_b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons]
                coeff: -1.0797924
                coeff: -1.0797924 CI: -2.2319267 .07234185
                 odds: .33966602 CI: .10732145 1.0750228
                
                . 
                . summarize mpg
                
                    Variable |        Obs        Mean    Std. dev.       Min        Max
                -------------+---------------------------------------------------------
                         mpg |         74     21.2973    5.785503         12         41
                
                . return list
                
                scalars:
                                  r(N) =  74
                              r(sum_w) =  74
                               r(mean) =  21.2972972972973
                                r(Var) =  33.47204738985561
                                 r(sd) =  5.785503209735141
                                r(min) =  12
                                r(max) =  41
                                r(sum) =  1576
                
                . 
                . di "coeff: " _b[_cons] _n ///
                > "coeff: " ln(_r_b[_cons]) " CI: " ln(_r_lb[_cons]) " " ln(_r_ub[_cons]) _n ///
                > " odds: " _r_b[_cons] " CI: " _r_lb[_cons] " " _r_ub[_cons]
                coeff: -1.0797924
                coeff: -1.0797924 CI: -2.2319267 .07234185
                 odds: .33966602 CI: .10732145 1.0750228
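
                The level() case mentioned at the top of this post can be sketched the same way. This assumes Stata 17 or later and relies on the behavior described above; the scalar names lb95 and lb90 are illustrative:

                ```stata
                sysuse auto, clear
                logit foreign price
                scalar lb95 = _r_lb[_cons]

                * Replay the same results with 90% intervals; nothing is re-estimated
                logit, level(90)
                scalar lb90 = _r_lb[_cons]

                * The 90% interval is narrower, so its lower bound is the larger one
                display "95% lower bound: " lb95 "   90% lower bound: " lb90
                ```

                Copying the bound into a scalar before the replay preserves the earlier value, since the system variables always reflect the most recently displayed table.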
