  • How the confidence interval is estimated within the ratio command

    Hi all,

    I have been trying to figure out how STATA estimates the 95% CI within the ratio command,
    or alternatively how I can extract the CI as estimated by ratio.
    I know that variance estimation and CIs are quite complex in ratio estimation, so I would like to rely on STATA's estimates, especially since I often use ratio under svy settings.
    Retrieving the CI estimates is part of writing my own tabulation-type command: I would like either to recompute the CIs manually from the retrieved SE estimates or to access the stored CIs.
    I have canvassed the manuals, but the method of CI estimation used within the ratio command is not explained.
    I am using STATA 14.1.
    command:
    use http://www.stata-press.com/data/r14/census2
    ratio (deathrate: death/pop) (marrate: marriage/pop)

    Any help in this regard would be highly appreciated.

    Andrej

  • #2
    Not so useful to respond to my own post, but I had limited success that I wish to share.
    I still haven't been able to identify the formula or the methodology STATA uses to obtain the CI for ratios.
    I did manage a workaround for my purpose.
    I wasn't aware of this option before: some (possibly all) tabular results are stored in a returned matrix called r(table).
    So I retrieved the results produced by the ratio command from that matrix and used them in my own programming.
    I thought I would share this with the rest, as it is the first time I have come across this option.
    Code:
    matrix list r(table)
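
    As a minimal sketch of that workaround (the matrix name T and the scalar names are hypothetical, chosen only for illustration), the limits can be copied out of r(table) by row and column name rather than by position. It assumes the column names are the ratio names shown in the ratio output:

    ```stata
    * Sketch: pull the CI limits out of r(table) right after -ratio-
    use http://www.stata-press.com/data/r14/census2, clear
    ratio (deathrate: death/pop) (marrate: marriage/pop)

    * copy r(table) at once, because later commands may overwrite r()
    matrix T = r(table)

    * look up rows "ll"/"ul" and the column for the first ratio by name
    scalar ll_death = T[rownumb(T, "ll"), colnumb(T, "deathrate")]
    scalar ul_death = T[rownumb(T, "ul"), colnumb(T, "deathrate")]
    display "deathrate 95% CI: " ll_death " to " ul_death
    ```

    Looking the cells up by name avoids relying on ll and ul always being rows 5 and 6.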



    • #3
      The Methods and Formulae for ratio in the manuals show the variance formula for the svy case (from around page 2076). What else do you need to know to figure out the calculation of the 95% CI? And are you sure that r(table) is what you want rather than what is saved in e(b) and e(V) after ratio? It is an estimation command, and you appear to be wanting to write an estimation command. It appears that r(table) is a convenience output that puts together output from the other saved results.

      Code:
      . use http://www.stata-press.com/data/r14/census2
      (1980 Census data by state)
      
      .
      . ratio (deathrate: death/pop) (marrate: marriage/pop)
      
      Ratio estimation                  Number of obs   =         50
      
          deathrate: death/pop
            marrate: marriage/pop
      
      --------------------------------------------------------------
                   |             Linearized
                   |      Ratio   Std. Err.     [95% Conf. Interval]
      -------------+------------------------------------------------
         deathrate |   .0087368   .0002052      .0083244    .0091492
           marrate |   .0105577   .0006184       .009315    .0118005
      --------------------------------------------------------------
      
      . eret list
      
      scalars:
                     e(df_r) =  49
                   e(N_over) =  1
                        e(N) =  50
                     e(k_eq) =  1
                     e(rank) =  2
      
      macros:
                  e(cmdline) : "ratio (deathrate: death/pop) (marrate: marriage/pop)"
                      e(cmd) : "ratio"
                      e(vce) : "linearized"
                  e(vcetype) : "Linearized"
                    e(title) : "Ratio estimation"
                 e(namelist) : "deathrate marrate"
                e(estat_cmd) : "estat_vce_only"
                  e(varlist) : "death pop marriage pop"
             e(marginsnotok) : "_ALL"
               e(properties) : "b V"
                   e(depvar) : "Ratio"
      
      matrices:
                        e(b) :  1 x 2
                        e(V) :  2 x 2
                       e(_N) :  1 x 2
                    e(error) :  1 x 2
      
      functions:
                   e(sample)  
      
      . mat list e(V)
      
      symmetric e(V)[2,2]
                  deathrate     marrate
      deathrate   4.211e-08
        marrate  -2.691e-08   3.824e-07
      
      . mat list e(b)
      
      e(b)[1,2]
          deathrate    marrate
      y1  .00873682  .01055773
      
      . ret list
      
      scalars:
                    r(level) =  95
      
      macros:
                 r(mcmethod) : "noadjust"
      
      matrices:
                    r(table) :  9 x 2
      
      . mat list r(table)
      
      r(table)[9,2]
              deathrate    marrate
           b  .00873682  .01055773
          se  .00020521  .00061842
           t  42.574529  17.072203
      pvalue  2.279e-40  2.892e-22
          ll  .00832443  .00931498
          ul  .00914921  .01180048
          df         49         49
        crit  2.0095752  2.0095752
       eform          0          0
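
      The listing suggests that ll and ul are simply b -/+ crit*se, with crit taken from a t distribution on e(df_r) degrees of freedom. A hedged sketch of rebuilding them from the stored estimation results follows; the scalar names are hypothetical, and that the same recipe carries over unchanged to svy: ratio is an assumption, not something shown in this output:

      ```stata
      * Sketch: rebuild the 95% limits from e(b), e(V) and e(df_r)
      use http://www.stata-press.com/data/r14/census2, clear
      ratio (deathrate: death/pop) (marrate: marriage/pop)

      matrix b = e(b)
      matrix V = e(V)

      * critical value: upper 2.5% point of t with e(df_r) degrees of freedom
      scalar crit = invttail(e(df_r), 0.025)

      * limits for the first ratio (deathrate): b -/+ crit * sqrt(variance)
      scalar se1 = sqrt(V[1,1])
      scalar ll1 = b[1,1] - crit*se1
      scalar ul1 = b[1,1] + crit*se1
      display "crit = " crit "   ll = " ll1 "   ul = " ul1
      ```

      The displayed values can then be compared with the crit, ll and ul rows of r(table).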


      • #4
        Thanks for the response. It's the method I am looking for. A simple linear estimation of the CI is straightforward, of course, but from what I read in the literature, the CI for a ratio should not be estimated linearly. Fieler's method is the one most commonly mentioned as correct. I am trying to replicate the method, but I am still not sure that this is what STATA uses. I have tried linear estimation as well as logit, but neither returns the same result as the STATA output. In other cases, of course, I use the stored estimates and variances to calculate different statistics.

        In the r(table) matrix I am looking at rows 5 and 6 (ll and ul) to retrieve the CI estimates. I agree this is not ideal, but until I can figure out how STATA estimates the interval I am inclined to use those.
        I have also noted that STATA sometimes returns a lower bound for the ratio estimate that is below 0, which seems counter-intuitive, as ratios like death rates or, in my case, net attendance rates of schools are non-negative.
        If you have any further information on how STATA does it, I would really appreciate it.

        Many thanks,
        Andrej


        • #5
          Write "Stata" (not "STATA") and you might make people more willing to help! Read the FAQ on this.
          More seriously, if you are going to mention methods, especially those associated with a name ("Fieler"?), you need exact references (see FAQ). [Don't simply say "the literature"!] It would also help to say precisely where you've looked in the manuals and where they don't provide the information you want for your purposes. Moreover, that some commands allow the reporting of CI lower bounds for ratios "less than zero" has, I am sure, been discussed on Statalist -- or at least the closely-related issues of estimating a proportion -- with reference also to commands that use transformations to ensure this (have you searched? ... again see the FAQ on strategies to help others help you).


          • #6
            This is from Example 1 in the Manual entry for ratio:

            Code:
               Ratio   Std. Err.   [95% Conf. Interval]
            .9230769    .032493    .8515603    .9945936
            Denoting the lower and upper confidence limits by L and U, we have:

            Code:
            U - Ratio = 0.0715
            Ratio - L = 0.0715
            This shows that the CI is based on adding and subtracting a multiple of the standard error from the estimate. Here the multiple is 0.0715/0.032493, or about 2.20, which is the .975 quantile of a t distribution with 11 d.f. (with 12 d.f. the quantile would be about 2.18), as you can verify for yourself.
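
            A quick way to run that check is to compare the multiplier implied by the reported limits with t quantiles. This is only a sketch: the numbers are copied from the output quoted above, and the candidate degrees of freedom (11 and 12) are values to try, not figures quoted from the manual:

            ```stata
            * Sketch: which t quantile reproduces the reported half-width?
            * implied multiplier = (upper limit - ratio) / standard error
            display (.9945936 - .9230769) / .032493

            * .975 quantiles of t for two candidate degrees of freedom
            display invttail(11, 0.025)
            display invttail(12, 0.025)
            ```

            Whichever quantile matches the implied multiplier identifies the degrees of freedom behind the interval.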
            Last edited by Steve Samuels; 21 Jan 2016, 15:48.
            Steve Samuels
            Statistical Consulting

            Stata 14.2
