Question about speed of reg versus reghdfe

Austin Yi

Join Date: Nov 2021

Posts: 17
#1

13 Feb 2022, 14:10

Hi Stata,

I am using Stata/MP (8-core) on a Mac with 16 GB of RAM and running a regression with fixed effects (FEs). The reghdfe command looks like this:

reghdfe IHS ib7.group1event ib7.group2event ib7.group3event ib7.group4event ib7.group5event ib7.group6event ib7.group7event ib7.group8event ib7.group9event [aw=population], absorb(i.week i.group i.state) cluster(state)

The data I am using has 20 million observations. I have managed to compress it to about 300 MB. However, reghdfe runs out of memory and my system asks me to quit Stata (I have tried the options poolsize(1) and compact, and it still runs out of memory). Even when I run reghdfe on 1/10 of the data (about 2 million observations), it takes about 10 minutes to finish. But when I run the same regression using the reg command on the full sample, it returns results in 30 seconds. Any intuition for what is going on here? I thought reghdfe was designed to make regressions with many FEs faster, but that does not seem to be the case here. I appreciate any thoughts.
Tiago Pereira

Join Date: Jan 2016

Posts: 365
#2

13 Feb 2022, 14:53

Quick question: have you tried running -reghdfe- with Stata in batch mode? E.g., from the terminal?
Austin Yi

Join Date: Nov 2021

Posts: 17
#3

13 Feb 2022, 16:27

Originally posted by Tiago Pereira

Quick question: have you tried running -reghdfe- with Stata in batch mode? E.g., from the terminal?

Hi Tiago, thanks for replying. I am not familiar with batch mode. Could you explain a little more? Thanks!
Tiago Pereira

Join Date: Jan 2016

Posts: 365
#4

13 Feb 2022, 17:53

Hi, Austin.

1. Save your do-file. Let's call it script.do and let's assume you save it on the following path /home/Desktop/my_folder
2. You need to know where the Stata executables are on your computer. Let's assume they are on the following path /usr/local/stata16/
3. Open the terminal and type

Code:

cd /home/Desktop/my_folder
/usr/local/stata16/stata-mp do "script.do"

Stata will run in "batch" mode, and, at least in my humble experience, that approach handles larger datasets better.
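For reference, the steps above could be wrapped in a small shell function along the following lines. This is only a sketch: the two paths are the example paths from this post, not real defaults (on a Mac the executable will be somewhere else), and `run_batch` is a made-up helper name. It also passes Stata's `-b` flag, which is the documented switch for non-interactive batch runs; with `-b`, output is written to a log file named after the do-file in the current directory rather than to the screen.

```shell
#!/bin/sh
# Sketch only: STATA_BIN and DO_DIR are the example paths from the post,
# assumed for illustration. Adjust them to your own installation.
STATA_BIN="${STATA_BIN:-/usr/local/stata16/stata-mp}"
DO_DIR="${DO_DIR:-/home/Desktop/my_folder}"
DO_FILE="${DO_FILE:-script.do}"

# run_batch is a hypothetical helper name, not part of Stata.
run_batch() {
    # Move to the folder holding the do-file; stop if it does not exist.
    cd "$DO_DIR" || return 1
    # -b runs the do-file in batch mode (no interactive session);
    # Stata writes its output to a .log file named after the do-file.
    "$STATA_BIN" -b do "$DO_FILE"
}
```

After pasting the definitions into a terminal, typing `run_batch` starts the job, and the results can be read from script.log once it finishes.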

Last edited by Tiago Pereira; 13 Feb 2022, 17:57.
Austin Yi

Join Date: Nov 2021

Posts: 17
#5

14 Feb 2022, 10:51

Any suggestions/thoughts on why reg is faster than reghdfe in this case?