OLS or Probit

John Galvin

Join Date: Feb 2019

Posts: 38
#1

04 Aug 2019, 13:55

Dear community,
I am researching how individual-level and collective-level economic indicators affect public attitudes towards immigration. To clarify, an example hypothesis: when a state suffers an economic downturn, the perception of a personal financial threat increases the probability of negative attitudes.

My dataset is a survey conducted during the European recession in seven countries.

My main DV is attitude towards immigration. IVs: at the individual level, economic evaluation, satisfaction with the state of the economy, and income satisfaction; at the national level, GDP, taxation change, unemployment rate, unemployment rate change, etc.

I am unsure whether I should use OLS regression or probit regression. I have coded all DVs and IVs for both types of regression and the statistical results seem fine; however, I am still not sure which type of regression to use.

Could you advise me, please?
Clyde Schechter

Join Date: Apr 2014

Posts: 29795
#2

04 Aug 2019, 15:30

If you use ordinary linear regression, you are fitting a linear probability model. Linear probability models are always, in principle, wrong, because for sufficiently extreme values of the predictors, they predict outcome probabilities outside the 0-1 range. It is also the case that when they predict values between 0 and 1 but close to either end of the interval, they are often appreciably miscalibrated. Nevertheless, if in your data, the linear model's predictions for the actual data do not do that, and do not come close to 0 or 1, but rather are well within the center of the 0-1 interval, then these models may be satisfactory. They have the advantage that regression coefficients can be interpreted directly as marginal effects in the probability metric.
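
A minimal sketch of that check, written in Python rather than Stata and using made-up data: fit the linear probability model, compute its fitted values for the observations actually in the sample, and look at their minimum and maximum. The helper names `ols_fit` and `fitted_range` and the toy data are assumptions for illustration only, not anything from the poster's dataset.

```python
# Illustrative sketch: fit a one-predictor linear probability model by
# ordinary least squares and inspect the range of its fitted values.

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fitted_range(x, intercept, slope):
    """Return (min, max) of the fitted values over the observed x."""
    fitted = [intercept + slope * xi for xi in x]
    return min(fitted), max(fitted)


# Hypothetical data: a binary outcome and a predictor on a 0-10 scale.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

intercept, slope = ols_fit(x, y)
lo, hi = fitted_range(x, intercept, slope)

# Observed predictor values whose fitted "probability" leaves [0, 1].
outside = [xi for xi in x if not 0.0 <= intercept + slope * xi <= 1.0]

print("fitted values range from", lo, "to", hi)
print("predictor values with out-of-range predictions:", outside)
```

If the minimum and maximum sit comfortably inside the 0-1 interval, the linear model may be acceptable for that sample; in this toy example they do not.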

Probit does not suffer from this 0-1 issue: it always makes predictions in 0-1. However, the coefficients of a probit regression are not marginal effects on probability. They are marginal effects on the normal ogive, which is, for most people, incomprehensible. For that reason, in some disciplines, probit regressions are little used: the results just don't lend themselves to interpretation beyond the signs of the effects. You can, of course, use the -margins- command to get marginal effects on outcome probability, but because the probit model is non-linear, you have to specify the values of the predictors at which you want to estimate the marginal effects.
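
To illustrate why the predictor values matter, here is a small Python sketch (not Stata, and not the output of -margins-) for a one-predictor probit with hypothetical coefficients `b0` and `b1`. For P = Phi(b0 + b1*x), the derivative with respect to x is phi(b0 + b1*x) * b1, so the same coefficient yields different marginal effects at different values of x.

```python
import math


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(b0, b1, x):
    """Predicted probability from a one-predictor probit model."""
    return normal_cdf(b0 + b1 * x)


def probit_marginal_effect(b0, b1, x):
    """dP/dx for P = Phi(b0 + b1*x), which equals phi(b0 + b1*x) * b1."""
    return normal_pdf(b0 + b1 * x) * b1


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.5

# Marginal effect on the probability at three values of the predictor.
effects = {x: probit_marginal_effect(b0, b1, x) for x in (0.0, 2.0, 6.0)}
probabilities = {x: probit_probability(b0, b1, x) for x in (0.0, 2.0, 6.0)}

for x in effects:
    print("x =", x, "P =", probabilities[x], "dP/dx =", effects[x])
```

The effect is largest where the index b0 + b1*x is zero (here x = 2) and shrinks toward the tails, while every predicted probability stays strictly between 0 and 1.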

Another model to consider is the logistic model. Like probit, it presents no 0-1 problem. Its coefficients are interpretable as the logarithms of odds ratios, which is something that most people can, after a little practice or training, wrap their minds around.
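
As a rough illustration of the odds-ratio reading, the Python sketch below uses a one-predictor logistic model with hypothetical coefficients `b0` and `b1`. It computes the ratio of the odds at x + 1 to the odds at x for several starting values of x; in each case the ratio equals exp(b1), regardless of where x starts.

```python
import math


def logistic_probability(b0, b1, x):
    """Predicted probability from a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.4

# Odds ratio for a one-unit increase in x, at several starting points.
ratios = []
for x in (0.0, 3.0, 7.0):
    odds_before = odds(logistic_probability(b0, b1, x))
    odds_after = odds(logistic_probability(b0, b1, x + 1.0))
    ratios.append(odds_after / odds_before)

print("odds ratios:", ratios)
print("exp(b1):    ", math.exp(b1))
```

This constancy is what distinguishes the logistic coefficient from the probit one: a single number summarizes the effect on the odds scale, although the effect on the probability scale still varies with x.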

John Galvin

Join Date: Feb 2019
Posts: 38

05 Aug 2019, 08:05

Dear Clyde, thank you for your advice! Indeed, I would apply a logistic model; however, in that case all predictors should be binary. In my case I need to use national economic indicators, which are complicated to code as binary. I was thinking of recoding national indicators such as GDP growth as binary (1 for positive, 0 for negative), and the unemployment rate and inflation rate likewise (coded the other way round). However, I am concerned about whether this approach is correct.

For now I have three different sets of variable codings for the regressions, as follows:

Summary statistics (binary model)

Variable	Mean	St.Dev	min	max	N
Immigration attitude (binary)	.679	.467	0	1	28408
GDP per cap. in thousands	39.525	4.023	31.933	46.419	29054
GDP growth rate(%)/100	.013	.021	-.044	.057	29054
Unemp. rate(%)/100	.086	.04	.037	.199	29054
Unemp. change rate (%)/100	.058	.155	-.131	.366	29054
Taxation per income change rate(%)/100	-.021	.044	-.069	.077	29054
Income evaluation (binary)	.47	.499	0	1	23832
Economic satisfaction (binary)	.344	.475	0	1	28615
Income satisfaction (binary)	.834	.373	0	1	28824
Age	48.527	18.607	15	123	28973
Gender (binary)	.478	.5	0	1	29045
Education (binary)	.145	.353	0	1	28980
Soc. class 5 categories	1.81	1.412	0	4	26363

Summary statistics (linear model)

Variable	Mean	St.Dev	min	max	N
Immigration attitudes 0 - 10	5.37	2.041	0	10	28909
GDP per cap. in thousands	39.525	4.023	31.933	46.419	29054
GDP growth rate in %	1.259	2.106	-4.405	5.703	29054
Unemp. rate in %	8.556	3.961	3.655	19.86	29054
Unemp. change rate (%)	5.802	15.496	-13.09	36.593	29054
Taxation per income change rate	-2.093	4.432	-6.919	7.717	29054
Income evaluation 1-10	5.324	2.83	1	10	23832
Economic satisfaction 1-10	4.345	2.414	0	10	28615
Income satisfaction 0-3	2.147	.787	0	3	28824
Age	48.527	18.607	15	123	28973
Gender (binary)	.478	.5	0	1	29045
Education (binary)	.145	.353	0	1	28980
Soc. class 5 categories	1.81	1.412	0	4	26363

Summary statistics (linear model: most variables recoded for range 0-min 1-max)

Variable	Mean	St.Dev	min	max	N
Immigration attitude (range 0-1)	.454	.22	0	1	4407
GDP per cap. in thousands	36.214	.2	36.016	36.416	4436
GDP growth rate(%)/100	.007	.01	-.003	.017	4436
Unemp. rate(%)/100	.067	.011	.056	.078	4436
Unemp. change rate (%)/100	.05	.017	.033	.067	4436
Taxation per income change rate(%)/100	-.013	.011	-.024	-.003	4436
Economic satisfaction (range 0-1)	.323	.205	0	1	4351
Income evaluation (range 0-1)	.514	.3	.1	1	3640
Income satisfaction (0-0.75)	.535	.204	0	.75	4389
Age	50.159	18.803	15	123	4403
Gender (binary)	.441	.497	0	1	4427
Education (binary)	.099	.299	0	1	4420
Soc. class 5 categories	1.704	1.461	0	4	4207

I am still debating: if I choose the linear model, which variables should I use, those that keep their original range (such as 0-10) or those that were rescaled by division to the 0-1 range?
From your professional perspective, after looking at these three tables of the same variables recoded for different models, which recoding looks more appropriate?

Kind regards,
John

Last edited by John Galvin; 05 Aug 2019, 08:12.


Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#4

05 Aug 2019, 12:03

Logit works just fine with dummy rhs variables. However, it is seldom a good idea to take a continuous variable like change in GNP and make it dichotomous. If you want to allow for different parameters on positive and negative growth, you could create your dummy and interact it with the continuous variable.
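
A minimal Python sketch of that dummy-times-continuous construction, with hypothetical coefficient values and helper names (`design_row`, `linear_index`) that are assumptions for illustration: the growth rate stays continuous, and the interaction term lets its slope differ between negative and positive growth.

```python
def design_row(growth):
    """Build [constant, growth, positive-growth dummy, dummy * growth]."""
    positive = 1.0 if growth > 0 else 0.0
    return [1.0, growth, positive, positive * growth]


def linear_index(coefs, row):
    """Linear predictor: sum of coefficient * regressor."""
    return sum(c * v for c, v in zip(coefs, row))


# Hypothetical coefficients: constant, growth, dummy, interaction.
coefs = [0.2, -3.0, 0.1, 5.0]

# Slope of the index with respect to growth on each side of zero.
slope_negative = coefs[1]             # applies when growth <= 0
slope_positive = coefs[1] + coefs[3]  # applies when growth > 0

# Confirm the slopes numerically from the design rows themselves.
step = 0.01
numeric_negative = (
    linear_index(coefs, design_row(-0.01))
    - linear_index(coefs, design_row(-0.02))
) / step
numeric_positive = (
    linear_index(coefs, design_row(0.02))
    - linear_index(coefs, design_row(0.01))
) / step

print("slope for negative growth:", slope_negative, numeric_negative)
print("slope for positive growth:", slope_positive, numeric_positive)
```

In a logit these slopes are on the log-odds scale, so the same design works whether the model is linear, probit, or logistic.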
Enrico Azzini

Join Date: Jul 2020

Posts: 79
#5

31 Jan 2022, 08:35

Hi everyone, I was looking for a post that would help me interpret the coefficients of a probit model. Reading this thread, I understood how to interpret the coefficients, but now I wonder: how can I check whether "the linear model's predictions for the actual data do not come close to 0 or 1"? Thanks for your attention.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#6

31 Jan 2022, 08:39

Enrico Azzini Care to give a reproducible example? I don't know what you mean here.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17601
#7

31 Jan 2022, 08:40

Enrico:
with no details at all (please note that, being a regular poster, you should be familiar with the FAQ) it is impossible to reply positively.
As usual, please share what you typed and what Stata gave you back. Thanks.

Kind regards,
Carlo
(StataNow 18.5)