  • Bootstrap and Monte Carlo

    Hi all,

    Because my earlier post was too complicated, I wrote a simpler piece of code to show my main problem.

    First, the output is identical from the second line to the end, so there must be something wrong with the structure of my code.

    thetae thetay1
    .4231397 .1541079
    .2788612 .191971
    .2788612 .191971
    .2788612 .191971
    .2788612 .191971

    Second, this code runs too slowly when I simulate something like 10,000 times. Any suggestions to make it more efficient?

    Sincerely yours,
    Fenndy


    program progone, rclass
    args g0 g1 g2
    confirm var `g0'
    confirm var `g1'
    confirm var `g2'
    regress `g0' `g1' `g2'
    lincom _b[`g1']-_b[`g2']
    return scalar d2=r(estimate)
    return scalar q2=r(se)
    end

    program progtwo, rclass
    args t
    drop _all
    set obs `t'
    egen time=seq()
    tsset time
    gen x=invnorm(runiform())
    gen v1=invnorm(runiform())
    gen v2=invnorm(runiform())
    gen v3=invnorm(runiform())
    gen ee=invnorm(runiform())
    gen y=x+v1+v2+ee
    gen y1=x+v1
    gen y2=x+v2+v3

    bootstrap dd2=r(d2) qq2=r(q2), reps(100) seed(11) saving(bootstrapsample,replace) nodots: progone y y1 y2
    use bootstrapsample, clear
    sum dd2, meanonly
    return scalar c1=r(mean)
    sum qq2, meanonly
    return scalar c2=r(mean)
    end
    simulate thetae=r(c1) thetay1=r(c2), seed(101) reps(5): progtwo 50

  • #2
    The reason you get the same number multiple times is that you set the seed in bootstrap. It is enough to set the seed once, in simulate.

    I have sped up your code a bit, but I don't think it will be a major improvement, because the main reason it is slow is unavoidable. A Monte Carlo simulation of a bootstrap estimator is always going to be very slow: the bootstrap as you set it up estimates your model 100 times, and you want to repeat that 10,000 times, so you end up estimating your model 1,000,000 times.

    Code:
    clear all
    program progone, rclass
    args g0 g1 g2
    confirm var `g0'
    confirm var `g1'
    confirm var `g2'
    regress `g0' `g1' `g2'
    // minor speed up by avoiding -lincom-
    // Warning: there is a tradeoff between a minor speed increase and
    // an increased probability of introducing bugs. So in general I
    // would strongly prefer -lincom-
    return scalar d2=_b[`g1']-_b[`g2']
    tempname v
    matrix `v' = e(V)
    return scalar q2=sqrt(_se[`g1']^2 + _se[`g2']^2 ///
                          - 2*el(`v', rownumb(`v',"`g1'"),colnumb(`v', "`g2'")))
    end
    
    program progtwo, rclass
    args t
    drop _all
    set obs `t'
    gen time=_n // -egen- is slower than doing it from first principles
    gen x =rnormal() // invnorm(uniform()) is superseded
    gen v1=rnormal()
    gen v2=rnormal()
    gen v3=rnormal()
    gen ee=rnormal()
    gen y=x+v1+v2+ee
    gen y1=x+v1
    gen y2=x+v2+v3
    
    bootstrap dd2=r(d2) qq2=r(q2), reps(100) : progone y y1 y2
    return scalar c1=_b[dd2] // no need to open a dataset afterwards
    return scalar c2=_b[qq2] // note: _b[] holds the observed statistics; the means over the replicates are in e(b_bs)
    end
    
    simulate thetae=r(c1) thetay1=r(c2), seed(101) reps(5): progtwo 50
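
The hand-computed standard error in progone can be checked against -lincom- before relying on it. Below is a minimal sketch of such a check. It assumes the auto dataset that ships with Stata as a stand-in for the simulated data; the variables price, mpg and weight and the scalar names are illustrative only and not part of the original code.

```stata
* Sketch: compare the manual SE of a difference of coefficients with -lincom-
* (assumption: the shipped auto data stand in for the simulated y, y1, y2)
sysuse auto, clear
regress price mpg weight

* reference value from -lincom-
lincom _b[mpg] - _b[weight]
scalar se_lincom = r(se)

* manual value: sqrt(Var(b1) + Var(b2) - 2*Cov(b1,b2))
matrix V = e(V)
scalar se_manual = sqrt(_se[mpg]^2 + _se[weight]^2 ///
    - 2*el(V, rownumb(V,"mpg"), colnumb(V,"weight")))

display "lincom: " se_lincom "   manual: " se_manual
```

If the two numbers disagree beyond rounding, the manual formula has a bug and -lincom- is the safer choice.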
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
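
To gauge how long the full run discussed above would take, one option is to time a handful of replications and extrapolate. The following is a rough sketch only: the program onerep is a hypothetical, simplified stand-in for progtwo (one regressor, 50 observations, 100 bootstrap replications), and the projection assumes the time per replication stays roughly constant.

```stata
clear all

* hypothetical stand-in for progtwo: simulate data, bootstrap one coefficient
program define onerep, rclass
    drop _all
    set obs 50
    gen x = rnormal()
    gen y = x + rnormal()
    bootstrap bx=_b[x], reps(100) nodots: regress y x
    return scalar b = _b[bx]
end

* time 5 Monte Carlo replications
timer clear 1
timer on 1
quietly simulate b=r(b), reps(5) seed(101) nodots: onerep
timer off 1

* r(t1) holds the elapsed seconds recorded by timer 1
quietly timer list 1
display "seconds per replication: " r(t1)/5
display "projected hours for 10,000 replications: " r(t1)/5*10000/3600
```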



    • #3
      Thank you so much, dear Mr. Maarten Buis. I really appreciate your comment, and it works. But I don't understand why I need to drop the seed in bootstrap. The seed in simulate controls how the estimation data are generated, and the seed in bootstrap controls how the estimation data are resampled. Would you mind explaining?



      • #4
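        Both bootstrap and simulate use the same random-number generator, which is controlled by a single seed. If you set the seed inside bootstrap, the generator is reset to the same state in every replication. From the second replication on, the data are therefore generated from the same random-number stream and resampled in the same way, which gives the result you found: repetitions of the same number.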
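
A small sketch of the effect, using nothing but -set seed- and runiform(): resetting the seed rewinds the single shared generator, so the draws that follow repeat. The scalar names are illustrative only.

```stata
* Sketch: setting the same seed twice reproduces the same draw
set seed 11
scalar first = runiform()

set seed 11                 // rewinds the generator to the same state
scalar second = runiform()  // identical to first

scalar third = runiform()   // generator has moved on, so this differs

display "first = " first "  second = " second "  third = " third
```

This is what happens inside progtwo when seed(11) is given to bootstrap: every call ends by rewinding the generator to the same point, so the next call starts from an identical state.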



        • #5
          OK, got it. Thanks so much, I appreciate it.
