Any tips for maximization techniques?

Richard Williams

Join Date: Apr 2014

Posts: 4900
#1

15 Apr 2015, 23:11

We are running some models that sometimes fail to converge, e.g. in 1000 simulations 76 did not converge. We have found that changing the maximization options can often work wonders. However, I am wondering if this is all hit or miss or if some strategies tend to be more effective than others.

Specifically, Stata's default is

technique(nr)

When that hasn't worked, we have tried things like

technique(nr bhhh)
technique(nr bhhh dfp bfgs)
technique(nr bhhh dfp bfgs) difficult

All often help, some more than others, but rather than just improvise each time it would be nice to know if some options tend to be more successful than others.

Also, Stata defaults to 16,000 iterations. Has anybody ever actually had a model that converged on iteration 15,974? My own experience is that either a model converges in well under 100 iterations or it doesn't converge at all. My suspicion is that, rather than letting something iterate forever, you are better off limiting it to 100 iterations and then switching to different maximization techniques. But I don't know if my experiences are typical.

In case it matters, I am still using the hopelessly antiquated Stata 13.1. But if somebody tells me that the maximization techniques have dramatically improved in 14 I will upgrade ASAP. The models are being estimated with the sem command. Thanks for any suggestions.
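
The fallback strategy described above (cap the iterations, then move on to a different technique) could be scripted roughly as follows. This is only a sketch under stated assumptions: the auto dataset, the toy model, and the particular list of technique specifications are placeholders rather than the actual simulation setup, and it relies on sem posting e(converged) after estimation.

```stata
* Sketch only: data, model, and the list of specifications are placeholders
sysuse auto, clear

* Candidate technique() specifications, tried in this order
local specs `""nr" "nr bhhh" "dfp 5 nr 5" "nr bhhh dfp bfgs""'
local done 0

foreach t of local specs {
    * Cap each attempt at 100 iterations rather than the default limit;
    * capture keeps the loop alive if a specification is rejected outright
    capture noisily sem (price <- mpg weight), technique(`t') iterate(100)
    if _rc == 0 & e(converged) == 1 {
        display as text "converged with technique(`t')"
        local done 1
        continue, break
    }
}

if !`done' display as error "no technique specification converged"
```

In a simulation, the same loop could sit inside each replication so that a failed default run falls through to the next specification instead of iterating up to the default limit.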

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 18.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Andrew Lover

Join Date: Apr 2014

Posts: 182
#2

16 Apr 2015, 00:16

I've kludged many of the same options, but have found that something like:

Code:

technique(bhhh 20 nr 20 dfp 20 bfgs 20) iterate(1000) diff trace

and then examining the trace to see which techniques get less stuck is useful. Hardly systematic or theory-based however...
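
As a rough illustration of the trace-based check, the options can be attached to a full command and the output saved to a log for later inspection. This is a sketch only: the toy data, the toy model, and the log file name are hypothetical, gsem is used because a later post in this thread reports running this specification under gsem, and difficult is spelled out in full. A specification such as bhhh 20 nr 20 dfp 20 bfgs 20 requests 20 iterations of each algorithm in turn, cycling back to the first if convergence has not been reached.

```stata
* Sketch only: toy data and model; the log file name is hypothetical
sysuse auto, clear
log using technique_trace.log, text replace name(tracelog)

* 20 iterations each of BHHH, NR, DFP, BFGS, cycling until convergence
* or until the 1000-iteration cap is hit; trace prints the parameter
* vector at every iteration
gsem (price <- mpg weight), ///
    technique(bhhh 20 nr 20 dfp 20 bfgs 20) iterate(1000) difficult trace

log close tracelog
```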

Last edited by Andrew Lover; 16 Apr 2015, 00:21.

____________________________________________________
Assistant Professor, Department of Biostatistics and Epidemiology
School of Public Health and Health Sciences
University of Massachusetts- Amherst
Stephen Jenkins

Join Date: Apr 2014

Posts: 1422
#3

16 Apr 2015, 02:26

FWIW my kludge that has worked "well" in some of my problems is

Code:

technique(dfp 5 nr 5)

Or sometimes with "10" instead of "5". As with Andrew, my checks have been neither theory-based nor systematic. I usually also set iterate(250), for the reasons Rich alludes to in #1. Tracking the gradient etc. via the trace, as Andrew suggests, has also been useful occasionally. I have found in my sorts of problems that the difficult option doesn't always help; it just prolongs the time to non-success.
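
Put together, that combination might look like the following on a toy problem. This is a sketch only: the dataset and model are placeholders, and the gradient option is added here as one assumed way of displaying the gradient alongside the trace.

```stata
* Sketch only: toy data and model stand in for the real problem
sysuse auto, clear

* Alternate 5 DFP and 5 NR iterations, give up after 250 iterations,
* and print the coefficient and gradient vectors at each iteration
sem (price <- mpg weight), ///
    technique(dfp 5 nr 5) iterate(250) trace gradient

* e(converged) is 1 if the model converged and 0 otherwise
display "converged = " e(converged)
```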
Richard Williams

Join Date: Apr 2014

Posts: 4900
#4

16 Apr 2015, 04:44

Thanks. I will play around with these. My experience is that difficult sometimes works miracles and sometimes makes things worse. Maybe it should just be called different rather than difficult.

Roman Mostazir

Join Date: Apr 2014

Posts: 869
#5

29 Nov 2016, 05:04

Originally posted by Andrew Lover

I've kludged many of the same options, but have found that something like:

Code:

technique(bhhh 20 nr 20 dfp 20 bfgs 20) iterate(1000) diff trace

and then examining the trace to see which techniques get less stuck is useful. Hardly systematic or theory-based however...

Stata version: 14.2.
Any idea how to define multiple techniques, as in the example above, when doing the estimation within the `stsem' path diagrams/sem graphs? My model converges with what Andrew suggested above when I run it from my do-file, but how do I define all these options when running the model from the sem estimation interface? It seems we can define only one technique there (and without the iteration count that Andrew used above, e.g. 'bhhh 20'), which apparently does not help my model converge. The 'diff trace' options (when running from the do-file, where the model converges) tell me which technique did the job, but selecting only that technique in the sem estimation interface does not make the model converge. Below are the commands for my model:

Code:

gsem (irln <- age age2 i.sex i.sex#c.age fmp htvel ht M1[id] c.age#M2[id]) ///
     (mabp <- age age2 i.sex i.sex#c.age fmp htvel ht irln M3[id] c.age#M4[id]), ///
     cov(M1[id]*M2[id] M3[id]*M4[id]) ///
     technique(bhhh 20 nr 20 dfp 20 bfgs 20) iterate(1000) diff trace

Any help to speed up the process will be highly appreciated.

Regards,

Roman