  • Run time for estimating variance matrix in user-written MLE context

    Colleagues,

    I have written an MLE routine that is quite time intensive. I've improved it as best I can, but I recently realized that estimating the variance-covariance matrix consumes about half of the total run time. I'm wondering if this is unusual.

    Here's a little more detail: The model being estimated has N parameters, one for each of a sample of N observations. (It's an obscure model, whose details I will omit here.) What is relevant, I think, is that I am estimating it in Mata with -optimize()- , with a d1 LL evaluator written completely in Mata. I obtain the variance-covariance matrix in Mata with
    Code:
    V = optimize_result_V_oim(S)
    Now, I understand that computation time for V must be at least O(N^2), so I'm not surprised that this routine gets quite slow as N gets large. But there are calculations in the LL evaluator that involve multiplication of N x N matrices, so I'm surprised that calculating V takes as much time as (say) 3-4 calls to the LL evaluator. Is this indicative of something weird, or is this just what I should expect from the expense of calculating and inverting the Hessian?
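    In case it's useful, here is a stripped-down sketch of the setup (the evaluator name -myd1()- and starting vector -p0- are placeholders, not my actual code):
    Code:
    S = optimize_init()
    optimize_init_evaluator(S, &myd1())     // d1 evaluator: returns LL and its gradient
    optimize_init_evaluatortype(S, "d1")
    optimize_init_params(S, p0)             // 1 x N row vector of starting values
    p = optimize(S)
    V = optimize_result_V_oim(S)            // this is the slow step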

    Regards, Mike

  • #2
    I think I discovered the answer to my own question: I was not thinking about the fact that the second derivatives are estimated numerically with a d1 evaluator. What brought this to mind -- a good lesson -- was obtaining detailed timings for some sections of code: I noticed that getting the variance matrix approximately doubled the number of times the gradient-calculation code was run, which is the expensive part of my code.
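
    For anyone who wants to see this split in their own code, the Mata timer functions make it easy to measure (a sketch; -S- is the optimize() handle set up earlier):
    Code:
    timer_clear()
    timer_on(1)
    p = optimize(S)                  // maximization
    timer_off(1)
    timer_on(2)
    V = optimize_result_V_oim(S)     // numeric Hessian, differenced from the d1 gradient
    timer_off(2)
    timer()                          // report elapsed time for each timer

    With N parameters, numerically differentiating the gradient takes on the order of N extra gradient evaluations, which is consistent with what I saw.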



    • #3
      You don't mention it, but I guess you use the Newton-Raphson (NR) algorithm. You might consider changing algorithms and switching to BFGS (or combining NR and BFGS), since BFGS uses an approximation of the Hessian based on the gradient.
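
      For example (a sketch; the iteration counts here are arbitrary):
      Code:
      optimize_init_technique(S, "nr 5 bfgs 15")   // cycle: 5 NR iterations, then 15 BFGS, repeating until convergence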



      • #4
        As it happens, I do my iteration by starting off with one NR rep, then change to BFGS. Where I'm seeing the time cost is at the end of the iteration, when -optimize_result_V_oim(S)- is issued. Do you mean that, if one uses BFGS, Stata uses its approximation of the Hessian, rather than computing one numerically from the gradient? The things I'm seeing in my timings make me think the variance matrix is coming from a Hessian estimated from the gradient, but I'm at the edge of my knowledge here.

        Thanks for thinking about this. - Mike
