Seeking Advice for Running Heckman Selection Model with High Dimensionality in Stata

Leona Lin

Join Date: Jan 2024

Posts: 1
#1

29 Jan 2024, 07:22

Dear Statalist Community,

I am currently working with the Heckman selection model in Stata and facing a challenging situation. My model includes a substantial number of two-way fixed effects, leading to a total of over 4,000 variables. Additionally, the dataset I am working with contains more than 85,000 observations. Due to this high dimensionality, I am unable to obtain results from Stata.

Could anyone kindly provide guidance or suggestions on how to handle such a large-scale model effectively? Specifically, I am looking for advice on:
1. Strategies to manage a large number of variables in the context of the Heckman model.
2. Approaches to dealing with large datasets while ensuring the robustness and validity of the model.

Any insights, tips, or references to relevant resources would be greatly appreciated. Thank you in advance for your time and assistance.

Sincerely,

Leona
Tags: data, Heckman selection model, Suggestion
FernandoRios

Join Date: Apr 2014

Posts: 2534
#2

29 Jan 2024, 11:44

I think you have a couple of options.
Depending on the nature of your two-way fixed effects (are both dimensions high-dimensional?), you could use xtheckmanfe (from SSC), which implements an estimator proposed by Wooldridge (see the command's help file for the reference).
Otherwise, you could use a kind of correlated random effects (CRE) model. This is similar to xtheckmanfe but puts more constraints on the specification.
Finally, keep in mind that high-dimensional nonlinear models are difficult to estimate, and your problem may be of a nature that does not require those fixed effects.
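
As a rough illustration of the correlated random effects idea, here is a minimal sketch using only official Stata commands. All variable names (y, s, x1, x2, z1, id, year) are hypothetical placeholders, and the sketch assumes only the unit dimension is high-dimensional, so the unit dummies are replaced by unit-level means of the covariates (Mundlak terms) while the period effects stay as ordinary dummies. It is a simplified pooled version of the approach, not the estimator that xtheckmanfe implements.

```stata
* Hypothetical names: y = outcome, s = selection indicator (0/1),
* x1 x2 = regressors, z1 = exclusion restriction, id = unit, year = period.

* Unit-level means of the time-varying covariates (Mundlak terms)
foreach v of varlist x1 x2 z1 {
    bysort id: egen double mean_`v' = mean(`v')
}

* Heckman ML with period dummies; the unit effects are proxied by the means
heckman y x1 x2 mean_x1 mean_x2 mean_z1 i.year,            ///
    select(s = x1 x2 z1 mean_x1 mean_x2 mean_z1 i.year)    ///
    vce(cluster id)
```

This cuts the number of estimated parameters from thousands of dummies to a handful of mean terms, at the cost of assuming that the unit effects depend on the covariates only through those means.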
F