  • Run time for estimating variance matrix in user-written MLE context

    Colleagues,

    I have written an MLE routine that is quite time intensive. I've improved it as best I can, but I recently realized that estimating the variance-covariance matrix consumes about half of the total run time. I'm wondering if this is unusual.

    Here's a little more detail: The model being estimated has N parameters, one for each of a sample of N observations. (It's an obscure model, whose details I will omit here.) What is relevant, I think, is that I am estimating it in Mata with -optimize()- , with a d1 LL evaluator written completely in Mata. I obtain the variance-covariance matrix in Mata with
    Code:
    V = optimize_result_V_oim(S)
    Now, I understand that computation time for V must be at least O(N^2), so I'm not surprised that this routine gets quite slow as N gets large. But there are calculations in the LL evaluator that involve multiplication of N x N matrices, so I'm surprised that calculating V takes as much time as (say) 3-4 calls to the LL evaluator. Is this indicative of something weird, or is this just what I should expect from the expense of calculating and inverting the Hessian?
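    In case it's useful, here is a stripped-down sketch of the setup (the evaluator name -myd1()- and starting vector -p0- are placeholders, not my actual code):
    Code:
    S = optimize_init()
    optimize_init_evaluator(S, &myd1())     // d1 evaluator: returns LL and its gradient
    optimize_init_evaluatortype(S, "d1")
    optimize_init_params(S, p0)             // 1 x N row vector of starting values
    p = optimize(S)
    V = optimize_result_V_oim(S)            // this is the slow step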

    Regards, Mike

  • #2
    I think I discovered the answer to my own question: I was not thinking about the fact that the second derivatives are estimated numerically with a d1 evaluator. What brought this to mind -- a good lesson -- was obtaining detailed timings for some sections of code: I noticed that getting the variance matrix approximately doubled the number of times the gradient-calculation code was run, which is the expensive part of my code.
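
    For anyone who wants to see this split in their own code, the Mata timer functions make it easy to measure (a sketch; -S- is the optimize() handle set up earlier):
    Code:
    timer_clear()
    timer_on(1)
    p = optimize(S)                  // maximization
    timer_off(1)
    timer_on(2)
    V = optimize_result_V_oim(S)     // numeric Hessian, differenced from the d1 gradient
    timer_off(2)
    timer()                          // report elapsed time for each timer

    With N parameters, numerically differentiating the gradient takes on the order of N extra gradient evaluations, which is consistent with what I saw.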



    • #3
      You don't mention it, but I guess you use the Newton-Raphson (NR) algorithm. You might consider changing algorithms and switching to BFGS (or combining NR and BFGS), since BFGS uses an approximation of the Hessian based on the gradient.
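
      For example (a sketch; the iteration counts here are arbitrary):
      Code:
      optimize_init_technique(S, "nr 5 bfgs 15")   // cycle: 5 NR iterations, then 15 BFGS, repeating until convergence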



      • #4
        As it happens, I do my iteration by starting off with one NR rep, then change to BFGS. Where I'm seeing the time cost is at the end of the iteration, when -optimize_result_V_oim(S)- is issued. Do you mean that, if one uses BFGS, Stata uses its approximation of the Hessian, rather than computing one numerically from the gradient? The things I'm seeing in my timings make me think the variance matrix is coming from a Hessian estimated from the gradient, but I'm at the edge of my knowledge here.

        Thanks for thinking about this. - Mike
