  • Non-Parametric Bootstrapping

    Hi,
    I am using Stata/IC 16.1 and would like to calculate the 95% CI for a variable (burden) that is the product of two random variables: the injury count (with an exposure offset) and the mean injury duration.
    This dataset represents a count of injuries, their cumulative duration and the individual's exposure duration within the surveillance period.

    To account for the uncertainty in both quantities, it has been recommended that I use a non-parametric bootstrap with 1,000 replications.

    Please see a sample of the dataset below.

    * Example generated by -dataex-.
    clear
    input long id double(injured duration exposure)
    18829 6 268 1461
    18830 6 673 1461
    18832 8 378 1461
    20896 1 81 1128
    20899 4 94 1461
    20902 3 448 761
    20906 5 202 1461
    20909 6 224 1461
    20913 9 556 761
    20914 7 372 1461
    end

    To calculate the burden of the population, the following code has been used previously:

    Code:
    *collapse data to summary level
    collapse (sum) injured duration exposure
     
    *create mean duration variable
    gen duration_mean = duration/injured
     
    *create incidence variable
    gen incidence = injured/exposure*1000
     
    *create burden variable
    gen burden = incidence*duration_mean
    Is there a recommended way to 1) streamline the code above, and 2) allow -bootstrap- to be run prior to the collapse?

    Many thanks.
    Last edited by Melissa Crunkhorn; 21 Dec 2023, 23:22.

  • #2
    Melissa:
    assuming that your calculation is (more or less) right up to -burden-, you may want to consider something along the following lines:
    Code:
    . quietly sum burden
    
    . bootstrap r(mean), reps(1000): quietly sum burden
    
    Bootstrap results                                        Number of obs =    10
                                                             Replications  = 1,000
    
          Command: summarize burden
            _bs_1: r(mean)
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 | coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           _bs_1 |   290.4472    68.5281     4.24   0.000     156.1346    424.7598
    ------------------------------------------------------------------------------
    
    . estat bootstrap, all
    
    Bootstrap results                               Number of obs     =         10
                                                    Replications      =       1000
    
          Command: summarize burden
            _bs_1: r(mean)
    
    ------------------------------------------------------------------------------
                 |    Observed               Bootstrap
                 | coefficient       Bias    std. err.  [95% conf. interval]
    -------------+----------------------------------------------------------------
           _bs_1 |   290.44722  -1.439478   68.528096    156.1346   424.7598   (N)
                 |                                       168.4967   432.1902   (P)
                 |                                       177.2014   460.8813  (BC)
    ------------------------------------------------------------------------------
    Key:  N: Normal
          P: Percentile
         BC: Bias-corrected
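
    The output above reports 10 observations, so -burden- there appears to have been computed per individual and then averaged. If the target is instead the population-level burden from the original post, one option is to resample individuals before any aggregation by wrapping the calculation in an r-class program. The following is only a minimal sketch: the program name popburden, the returned scalar names and the seed are illustrative assumptions, not part of the thread. Note that incidence*duration_mean reduces algebraically to 1000*sum(duration)/sum(exposure), which the sketch uses directly.

```stata
* Example data from the original post (generated by -dataex-)
clear
input long id double(injured duration exposure)
18829 6 268 1461
18830 6 673 1461
18832 8 378 1461
20896 1 81 1128
20899 4 94 1461
20902 3 448 761
20906 5 202 1461
20909 6 224 1461
20913 9 556 761
20914 7 372 1461
end

* Hypothetical helper: computes the population-level quantities from
* whatever (re)sample is currently in memory, without -collapse-
capture program drop popburden
program define popburden, rclass
    tempname inj dur expo
    quietly summarize injured, meanonly
    scalar `inj' = r(sum)
    quietly summarize duration, meanonly
    scalar `dur' = r(sum)
    quietly summarize exposure, meanonly
    scalar `expo' = r(sum)
    * injuries per 1000 units of exposure
    return scalar incidence = `inj' / `expo' * 1000
    * mean duration per injury
    return scalar duration_mean = `dur' / `inj'
    * incidence * duration_mean simplifies to this ratio
    return scalar burden = 1000 * `dur' / `expo'
end

* Resample individuals with replacement, recomputing the totals each time
* (the seed value is arbitrary and only there for reproducibility)
bootstrap burden = r(burden), reps(1000) seed(12345): popburden

* Normal, percentile and bias-corrected intervals
estat bootstrap, all
```

    Because each replication recomputes the totals from the resampled individuals, the original data are left uncollapsed. With only 10 observations in this sample the intervals will be unstable, so this is best treated as a template for the full dataset.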
    
    Kind regards,
    Carlo
    (Stata 19.0)
