Marginal Effects after Mlogit with big data

Tiffany Theresia

Join Date: May 2018

Posts: 4
#1


30 May 2018, 02:10

Hello Stata Forum,

I'm running a multinomial logit model with over 16,000,000 observations and 14 different variables. The multinomial logit runs without issue, even though it still takes 30-60 minutes.

However, the margins command takes hours to run. I have already resampled my data down to just over 3,000,000 observations, but Stata still takes hours to produce the results. If anyone has any insight on big data sets, nonlinear probability models, and margins/mfx compute, it would be much appreciated.

Also, is there any way to run margins for each outcome without running the mlogit command first?

Thanks
Richard Williams

Join Date: Apr 2014

Posts: 4935
#2

30 May 2018, 06:47

If you can live without the standard errors, adding the -nose- option to margins will probably speed things up considerably.
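A minimal sketch of the -nose- suggestion, for illustration only. It assumes the sysdsn1 example dataset that ships with Stata (loaded with webuse) rather than the poster's data, and it uses timer only to compare run times; the variable names and model are stand-ins.

```stata
* Illustrative only: sysdsn1 stands in for the real 16M-observation data
webuse sysdsn1, clear
mlogit insure age male nonwhite i.site

timer clear

* Average marginal effect of age, with delta-method standard errors
timer on 1
margins, dydx(age)
timer off 1

* Same point estimates, skipping the standard-error computation
timer on 2
margins, dydx(age) nose
timer off 2

* Compare elapsed seconds for the two margins calls
timer list
```

On a dataset this small the difference will be negligible; the gap should only become visible with many observations. In Stata 14.2 or later, each margins call above reports every outcome; in earlier versions you would add predict(outcome(#)) and repeat the call per outcome.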

I'm not sure what you mean by running margins for each outcome without running the mlogit command first. If you mean predicting outcome 1, then outcome 2, and so on, then you only need to run mlogit once. And if you are running Stata 14.2 or later, margins will do all the outcomes for you.

Incidentally, I like to use the spost13 mtable command, especially for things like ologit and mlogit. The output looks tidier, for one thing. For more details, see

https://www3.nd.edu/~rwilliam/stats3/Margins05.pdf

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#3

30 May 2018, 08:55

Tiffany,

In addition to what Richard said, -margins- is inherently pretty slow in this application. One obvious way around this is to throw money at the problem - buy Stata MP or add more cores to your existing Stata MP license, get faster processors, etc. But obviously that doesn't help you right now!

Pardon me if you know this already, but you can use -estimates store- to store models in memory, or -estimates save- to save them to disk. This isn't needed to run a bunch of consecutive -margins- commands on one model. However, if you think of some other way to present margins later on, you can restore the original model estimates later and run -margins-, e.g.

Code:

* x1 and x2 are assumed categorical; margins needs them entered with the i. prefix
mlogit y i.x1 i.x2 x3
estimates store model1
margins x1, predict(outcome(1))
margins x1, predict(outcome(2))

/* You run a bunch more models, then you have a eureka moment */
estimates restore model1
margins x1#x2, predict(outcome(1))
margins x1#x2, predict(outcome(2))
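For the -estimates save- route, here is a rough sketch of saving a fitted model to disk and reusing it in a later session. The sysdsn1 dataset and the file name mlogit_insure are placeholders, not anything from the original question. As far as I understand it, e(sample) is not restored by -estimates use-, so the sketch resets it with -estimates esample- before calling margins.

```stata
* Session 1: fit the slow model once and write the results to disk
webuse sysdsn1, clear
mlogit insure age male nonwhite i.site
estimates save mlogit_insure, replace

* Session 2 (possibly days later): reload the same data and the saved results
webuse sysdsn1, clear
estimates use mlogit_insure

* Re-mark the estimation sample as the complete cases on the model variables
estimates esample: insure age male nonwhite site

* margins now works without refitting mlogit
margins site, predict(outcome(1)) nose
```

The data in memory must be the same data the model was fitted on; otherwise the margins are computed over the wrong sample.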

For commands where I know things will take a long time, I often set them up to run overnight. Again, pardon me if you already knew this.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
John Mullahy

Join Date: Dec 2016

Posts: 742
#4

30 May 2018, 13:11

Tiffany: Are you able to post the summary statistics from your estimation sample (sum, d) ? I have an idea that will work for some data structures but not others and I'll know from your summary stats if it would work for your case.