  • Efficient bootstrapping of moptimize()

    I have a question that I bet any of you can answer, especially if your name is Jeff Pitblado.

    I am writing something to bootstrap results from an moptimize() model fit. The individual estimates run pretty fast, but I'm doing millions of them, in Monte Carlo simulations, so speed matters.

    In each estimation the only thing that changes is a column of data stored in a Mata matrix, which is passed to the likelihood evaluator via moptimize's userinfo facility. So (very cleverly, I thought), I constructed a loop something like this:

    Code:
    bhat = J(reps, K, .) // will hold estimates of K parameters, from reps replications
    
    for (i=reps; i; i--) {
        
        [code to generate random data matrix x for this replication]
        
        moptimize_init_userinfo(M, 1, x) // store the new data matrix in M, an existing moptimize() model object
    
        (void) _moptimize(M) // optimize
        
        if (moptimize_result_converged(M))
            bhat[i,] = moptimize_result_coefs(M)
    }
    I like this because it minimizes housekeeping in each iteration.

    At first it seemed to be working fine, but eventually I realized that there was some hysteresis: two fits to the same data matrix x might produce different results, at least in that one would converge and one not, depending on the order in which I ran various simulations. So it seems that I am being too clever, and that I need to do more to reset the search process before I call _moptimize() again on an already-fitted model.

    So my question is, if I want to efficiently re-fit a model after making a change, what are the steps to take to make sure I don't confuse moptimize()?

    --David
    Last edited by David Roodman; 24 Jan 2015, 16:45.

  • #2
    Jeff, is that because assigning starting values will help the search or because resetting them will trigger different behavior? As it is, the search appears to start at the same point every time, which is OK.
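
    For concreteness, here is a sketch of one way to pin the search down in the loop from #1: assign explicit starting values before every call, via the documented two-argument form moptimize_init_coefs(M, b0). The fixed zero vector b0 is an illustrative assumption, not the thread's confirmed fix.

    Code:
    b0 = J(1, K, 0) // fixed, arbitrary starting values for the K parameters
    
    for (i=reps; i; i--) {
        
        [code to generate random data matrix x for this replication]
        
        moptimize_init_userinfo(M, 1, x) // new data for this replication
        moptimize_init_coefs(M, b0)      // restart the search from the same point
    
        (void) _moptimize(M)
        
        if (moptimize_result_converged(M))
            bhat[i,] = moptimize_result_coefs(M)
    }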



    • #3
      I believe that all depends on whether you specified moptimize_init_search(M, "off").
      With different data, and moptimize_init_search(M, "on") (the default), you might
      observe the starting values changing within your loop.

      There is an undocumented function moptimize_init_coefs(M) that returns the
      starting values used to compute the likelihood in iteration 0.

      Here is a quick example using the logit model.

      Code:
      sysuse auto
      mark touse
      markout touse foreign mpg turn trunk
      local reps 5
      
      mata:
      
      void mylogit_lf(transmorphic M, real rowvector b, real colvector ll)
      {
              real    colvector       y
              real    colvector       xb
      
              xb      = moptimize_util_xb(M, b, 1)
              y       = moptimize_util_depvar(M, 1) :!= 0
              y       = (2:*y) :- 1           // sign: +1 if foreign, -1 otherwise
      
              ll = log(invlogit(y:*xb))
      }
      
      M = moptimize_init()
      moptimize_init_which(M, "max")
      moptimize_init_evaluator(M, &mylogit_lf())
      moptimize_init_evaluatortype(M, "lf")
      moptimize_init_search(M, "on")          // "on" is the default
      moptimize_init_search_random(M, "on")
      moptimize_init_touse(M, "touse")
      moptimize_init_depvar(M, 1, "foreign")
      moptimize_init_eq_indepvars(M, 1, "mpg turn trunk")
      bhat = J(`reps', 4, .)
      b0 = J(`reps', 4, .)
      st_view(V=., ., "foreign")
      for (i=1; i<=`reps'; i++) {
              moptimize_init_userinfo(M, 1, i)
              V[.,1] = runiform(rows(V),1) :> .5
              (void) _moptimize(M)
              if (moptimize_result_converged(M)) {
                      bhat[i,] = moptimize_result_coefs(M)
              }
              b0[i,] = moptimize_init_coefs(M)
      }
      b0
      bhat
      
      end
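
      Conversely, to make iteration 0 start from identical values in every replication, one could turn the search off and supply fixed initial values instead. A sketch only; the zero vector, and the parameter count of 4 matching the example above, are illustrative choices.

      Code:
      moptimize_init_search(M, "off")         // skip the pre-optimization search
      moptimize_init_coefs(M, J(1, 4, 0))     // fixed starting values for the 4 parameters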
