Linear probability model in propensity score estimation

Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#1

09 Jun 2017, 09:09

I was wondering why the teffects psmatch command only allows the propensity score to be estimated by logit or probit. While these are the standard models for probability estimation, I personally quite like the Linear Probability Model (standard OLS with a binary outcome variable), and I assume others have estimated probabilities with different models as well.

See for example the paper by Caliendo & Kopeinig (2005) from IZA, which also argues:
"In principle any discrete choice model can be used. Preference for logit or probit models (compared to linear probability models) derives from the well-known shortcomings of the linear probability model, especially the unlikeliness of the functional form when the response variable is highly skewed and predictions that are outside the [0, 1] bounds of probabilities. However, when the purpose of a model is classification rather than estimation of structural coefficients, it is less clear that these criticisms apply (Smith, 1997)."
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

12 Jun 2017, 14:05

http://www.stata.com/support/faqs/st...l-constraints/
Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#3

12 Jun 2017, 15:50

I do not understand the relevance to my query...
Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#4

07 Oct 2017, 12:07

Anyone?
Clyde Schechter

Join Date: Apr 2014

Posts: 29958
#5

07 Oct 2017, 14:44

Well, if you ran a linear probability model and it came up with predicted probabilities outside the [0, 1] range, they would be unusable altogether for propensity score weighting. The other thing is that the ability of propensity score matching to estimate a causal effect, under certain assumptions, relies on the propensity score being a valid estimate of the probability of being in the group. So, again, if you got predicted probabilities outside the [0, 1] range you could match on them, but the theorem saying that the matched analysis provides an estimate of the causal effect would fail to hold.
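
To make the weighting problem concrete, here is a minimal sketch in Python (standard library only, not Stata). The data are made up for illustration and `fit_lpm` is a hypothetical helper, not part of any package. It fits a one-covariate LPM by OLS, flags fitted values outside (0, 1), and computes the usual ATT odds weights p/(1-p) for the controls, which turn negative once a fitted value drops below zero.

```python
# Made-up toy data: x is a single covariate, d is treatment status (0/1).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
d = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def fit_lpm(x, d):
    """OLS of d on a constant and x (hypothetical helper).

    Returns (intercept, slope) of the linear probability model.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    slope = sxd / sxx
    return mean_d - slope * mean_x, slope


intercept, slope = fit_lpm(x, d)

# Fitted "propensity scores" from the LPM: nothing keeps them inside (0, 1).
pscore = [intercept + slope * xi for xi in x]
out_of_range = [p for p in pscore if p <= 0 or p >= 1]

# ATT odds weights for the controls: p / (1 - p).
# A control with a negative fitted value gets a negative weight.
control_weights = [p / (1 - p) for p, di in zip(pscore, d) if di == 0]

print("fitted values outside (0, 1):", out_of_range)
print("control weights:", control_weights)
```

With these toy numbers the lowest and highest values of x produce fitted values below 0 and above 1, and the first control ends up with a negative weight.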

If you used a linear probability model, and if all the predicted probabilities fell within the 0, 1 range and the model was well calibrated, then I don't see any reason you couldn't use that. I guess the people who wrote -psmatch- felt that either there wouldn't be much demand for this, or they didn't want to get into what to do if the linear probability model produced unusable results. If you want to use a linear probability model, you can always implement it directly and then do the weighting or matching yourself. Not as convenient, but doable.
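
As a rough illustration of "implement it directly and then do the matching yourself", here is a minimal Python sketch (standard library only). Everything in it is an assumption made for illustration: the data are made up, `fit_lpm` is a hypothetical helper, and the matching rule (one nearest control per treated unit, with replacement) is just one simple choice and is not claimed to reproduce what -teffects psmatch- does. It fits the LPM, refuses to continue if any fitted value falls outside (0, 1), and then averages the treated-minus-matched-control outcome differences.

```python
# Made-up toy data: covariate x, treatment d (0/1), outcome y.
x = [1.0, 2.0, 2.5, 4.0, 4.5, 6.0, 6.5, 8.0]
d = [0, 0, 1, 0, 1, 0, 1, 1]
y = [1.0, 2.0, 4.5, 4.0, 6.5, 6.0, 8.5, 10.0]


def fit_lpm(x, d):
    """OLS of d on a constant and x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    slope = sxd / sxx
    return mean_d - slope * mean_x, slope


intercept, slope = fit_lpm(x, d)
pscore = [intercept + slope * xi for xi in x]

# Stop if the LPM produced fitted values that are not valid probabilities.
if any(p <= 0 or p >= 1 for p in pscore):
    raise ValueError("LPM fitted values outside (0, 1); do not match on them")

treated = [i for i, di in enumerate(d) if di == 1]
controls = [i for i, di in enumerate(d) if di == 0]

# For each treated unit, pick the control with the closest fitted score
# (matching with replacement, one neighbour).
diffs = []
for i in treated:
    j = min(controls, key=lambda c: abs(pscore[c] - pscore[i]))
    diffs.append(y[i] - y[j])

# Average treated-minus-matched-control difference (an ATT-style estimate).
att = sum(diffs) / len(diffs)
print("matched ATT estimate:", att)
```

Standard errors are deliberately left out; getting those right after matching on an estimated score is a separate problem.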
daniel klein

Join Date: Mar 2014

Posts: 3824
#6

07 Oct 2017, 15:27

A non-linear model (e.g., logit) has the additional advantage over the LPM that it implicitly takes interactions between all predictors into account. The (marginal) effect of one predictor depends on the levels of all other predictors in the non-linear model. Since the selection process being modeled is most likely not a simple linear (additive) function, and given that you usually do not explicitly include all possible interaction effects in your linear model to account for that, the non-linear models are preferred.
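
The dependence of the marginal effect on the other predictors can be sketched in a few lines of Python. The coefficients below are hypothetical, chosen only for illustration, not estimated from any data. For a logit with index z = b0 + b1*x1 + b2*x2, the marginal effect of x1 is b1 * p * (1 - p), so it changes whenever x2 shifts p; in an LPM without interaction terms it is simply the coefficient on x1. Note that in the logit this dependence comes from the functional form itself and not from any estimated interaction term.

```python
import math

# Hypothetical coefficients, for illustration only.
b0, b1, b2 = -1.0, 0.5, 1.0


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect_x1(x1, x2):
    """dP/dx1 in a logit model: b1 * p * (1 - p)."""
    p = logistic(b0 + b1 * x1 + b2 * x2)
    return b1 * p * (1.0 - p)


def lpm_marginal_effect_x1(x1, x2):
    """dP/dx1 in an additive LPM: the coefficient, whatever x1 and x2 are."""
    return b1


# Same x1, two different levels of x2.
me_logit_a = logit_marginal_effect_x1(0.0, 0.0)
me_logit_b = logit_marginal_effect_x1(0.0, 1.0)
me_lpm_a = lpm_marginal_effect_x1(0.0, 0.0)
me_lpm_b = lpm_marginal_effect_x1(0.0, 1.0)

print("logit marginal effects:", me_logit_a, me_logit_b)
print("LPM marginal effects:  ", me_lpm_a, me_lpm_b)
```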

Best
Daniel