Vertex with a probit model

Marry Lee

Join Date: Nov 2020

Posts: 189
#1

14 Apr 2022, 05:55

Hello

I am testing the effect of a change in income on a binary dependent variable in a probit model.
I am using the change in income and its quadratic term, and I want to calculate the turning point for change in income.

In this forum, most of those who asked questions about this topic were using a linear model (regress y x x² )

Can you please tell me whether, with probit, we also get the vertex by just typing

Code:

nlcom -_b[change_mean_hhInc]/(2*_b[c.change_mean_hhInc#c.change_mean_hhInc])

Or do we need some other modifications for the value we get from this command?

Below you find the graph from the following command:

Code:

quietly probit M_P c.change_mean_hhInc##c.change_mean_hhInc
margins, at(change_mean_hhInc = (-1.127198 -0.6972904 -0.5416546 -0.4285192 -0.3086205 -0.2537069 -0.2098856 -0.1504059 -0.1019115 -0.0572195 0.0002527 0.0591002 0.1088343 0.1564083 0.2028532 0.2605009 0.4123831 0.6492863))
marginsplot

Attached Files

vertex.gph (8.4 KB, 1 view)
Marry Lee

Join Date: Nov 2020

Posts: 189
#2

14 Apr 2022, 13:55

Does anyone have any remarks, please? Thank you.
Clyde Schechter

Join Date: Apr 2014

Posts: 30192
#3

14 Apr 2022, 14:00

Because you are not using a linear model, the curvilinear relationship you fit will not make the outcome probability a quadratic function of change in income. The intervening probit link function will somewhat distort the parabola into a somewhat different looking curve. Nevertheless, the -_b[linear_term]/(2*_b[quadratic_term]) calculation will still identify the location on the horizontal axis of the turning point in the curve. This is true because of the chain rule and the fact that the probit link is monotonic.
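The chain-rule argument can be written out as a short derivation. This is a sketch that assumes the model contains only the linear and quadratic terms of change in income, written as $x$, with coefficients $\beta_1$ and $\beta_2$; the symbols are introduced here for illustration and are not from the original posts.

```latex
\begin{aligned}
P(y = 1 \mid x) &= \Phi\!\left(\beta_0 + \beta_1 x + \beta_2 x^2\right) \\
\frac{dP}{dx} &= \phi\!\left(\beta_0 + \beta_1 x + \beta_2 x^2\right)\,\left(\beta_1 + 2\beta_2 x\right) \\
\frac{dP}{dx} = 0 &\iff \beta_1 + 2\beta_2 x = 0
  \quad\text{(because } \phi(\cdot) > 0 \text{ everywhere)} \\
x^{*} &= -\frac{\beta_1}{2\beta_2}
\end{aligned}
```

Here $\Phi$ is the standard normal CDF and $\phi$ its density. Because $\phi$ is strictly positive, the sign of the slope of the probability curve is the sign of $\beta_1 + 2\beta_2 x$, so the probability curve and the underlying quadratic turn at the same value of $x$.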
Marry Lee

Join Date: Nov 2020

Posts: 189
#4

14 Apr 2022, 14:12

Thank you Clyde Schechter for your answer.
So by this:

"Because you are not using a linear model, the curvilinear relationship you fit will not make the outcome probability a quadratic function of change in income."

do you mean that it would be wrong to draw any conclusions about the quadratic term in the probit model? So if I find a positive coefficient on income change and a negative coefficient on income change squared, I cannot talk about a positive but decreasing effect on the probability of the outcome?

You also say:

"The intervening probit link function will somewhat distort the parabola into a somewhat different looking curve."

So I don't really need to present the graph of the predicted outcome probabilities at values of income change, since the shape will not be the right one. Also, the vertex will not be interesting to calculate, since it will be the vertex of a wrong curve, right?
Thank you again!
Clyde Schechter

Join Date: Apr 2014

Posts: 30192
#5

14 Apr 2022, 14:28

If your goal is to specifically identify a quadratic relationship between income change and outcome probability, then don't use a probit model. Do a linear probability model with linear and quadratic income terms. That will do that. But that seems an odd thing to do: very few things in the real world actually follow a quadratic equation. Usually when we add quadratic terms to a model it is because we are trying to identify a U or inverted-U shaped relationship, or, less often, just trying to demonstrate some kind of non-linearity in the relationship. So it would surprise me to learn that it really matters to your research goal whether the graph of outcome probability vs income change actually is a parabola. If it does, then you shouldn't use a probit model; use a linear probability model including linear and quadratic terms for income change.

I imagine, however, that you are just interested in U (or inverted-U) or curvilinearity. If that's right, the intervening probit link doesn't really matter. Certainly, if you have a substantial coefficient for the quadratic term, that will give you support for curvilinearity. Whether you really have a true U (or inverted-U) relationship is another matter: there are a lot of non-quadratic curvilinear functions that, over limited ranges, show a good fit to a quadratic regression model. (y = log(x) is a good example of that.) So to establish a U-shaped relationship you really need to see that you have an increasing outcome probability with small values of income change leading up to a turning point and then followed by a decreasing relationship. There are quantitative approaches to doing that, but they are rather complicated, and I think that the best way to do this is to look at a plot of the data overlaid on a graph of the fitted curve.

Also, while the shape of the fitted curve from a probit model will not be a parabola, it will nevertheless, if you have a turning point that is inside the range of the data, start out increasing, reach a maximum (at the same point where a true parabola would) and then decrease. Depending on details of the data, that curve may also show other types of curvature, and it may not be symmetric. It could be flattened out enough to look very little like a parabola. But it will have a turning point in "the right place."
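A minimal sketch of the linear probability model alternative described in this post. It assumes auto.dta, with foreign as the binary outcome and mpg standing in for income change; these are illustration choices, not the poster's data. Robust standard errors are requested because the errors of a linear probability model are heteroskedastic by construction.

```stata
sysuse auto, clear

// Linear probability model: the fitted probability is exactly quadratic in mpg
regress foreign c.mpg##c.mpg, vce(robust)

// Turning point of the fitted parabola, with a delta-method standard error
nlcom (vertex: -_b[mpg]/(2*_b[c.mpg#c.mpg]))
```

With your own data, replace foreign and mpg with the outcome and the income change variable.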
Marry Lee

Join Date: Nov 2020

Posts: 189
#6

14 Apr 2022, 15:00

Thank you Clyde Schechter for your great explanation (as always).

In fact, I built a theoretical model, and this model tells me that I need to include income change and its square.
I test this model with a probit model, I get a positive coefficient for the income change variable and a negative coefficient for the income change variable squared, so I have an inverted U shaped relationship.
My objective is just to interpret these results. So I think I am not trying to show that the relation is exactly quadratic but rather that there is non-linearity.

So is it safe to say the following about my results:
A positive income change increases the probability of going to college. However, this effect significantly decreases with higher levels of income change?

When I wanted to present the graph, it was not to show that it is a parabola, but just to show the shape of the relationship (but apparently, with probit we don't get the real relationship so it is better not to present a graph).

Finally, my supervisor wants me to present a vertex, hence my question about it and about whether with probit we get the real maximum where there is a turning point. Your answer in #3 was yes we do. But now that I see your answer in #5, I wonder if there may be more than one maximum? and if yes whether Stata gives me the highest maximum?

You say:

"So to establish a U-shaped relationship you really need to see that you have an increasing outcome probability with small values of income change leading up to a turning point and then followed by a decreasing relationship. There are quantitative approaches to doing that, but they are rather complicated, and I think that the best way to do this is to look at a plot of the data overlaid on a graph of the fitted curve."

Can you please tell me how I can plot the data overlaid on a graph of the fitted curve?
Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30192
#7

14 Apr 2022, 18:47

"But now that I see your answer in #5, I wonder if there may be more than one maximum? and if yes whether Stata gives me the highest maximum?"

Your fitted curve will exhibit only one maximum, and it will be at the same value of income change as the parabolic (pure quadratic polynomial) curve would be. The reason this is true is that the probit link is monotone increasing. So for any values where the quadratic is increasing, the probit-transformed quadratic one will be as well. For any values where the quadratic is decreasing, the probit-transformed one will be as well. And for the one and only value where the quadratic reaches a maximum, the probit-transformed one will have a unique maximum as well. (Actually, it's all inverse-probit transformed, not probit-transformed--I just got lazy on the typing for a minute there.)

Depending on the details of the data, the resulting fitted curve may look pretty close to a parabola. Or it may not. In fact, it could look like a bell-shaped curve. It could even look like a very flattened bell-shaped curve where the peak is hard to discern. But regardless of how it gets deformed by the probit link, it will have a unique maximum in the "right" place, and it will increase and decrease in the same regions as the quadratic itself does.

"I get a positive coefficient for the income change variable and a negative coefficient for the income change variable squared, so I have an inverted U shaped relationship."

Not necessarily. If the value of the turning point for the quadratic lies outside the range of values of income change in the data, then you have only a curve that is, depending on which side of the turning point all the data lies on, an upsloping curve that flattens out a bit, or a downsloping curve that flattens out a bit. What you can assert confidently is that you have a curvilinear relationship. But to say that it is specifically U-shaped you need to do a bit more work. As I have said, I think the best way to handle it is graphically. Of course, since you have a dichotomous outcome, it is somewhat difficult to display the outcome variable graphically. My approach would be to bin the data and thereby aggregate the scatterplot horizontally, and separate the points vertically with jitter.

As you have not shown example data, I'm not going to write exact code for this, but here is the gist of it. I have illustrated it with the auto.dta using foreign as the dichotomous outcome and mpg as the continuous explanatory variable. It's not all that quadratic a relationship, but I'm just trying to illustrate the code you could use. Part of the complication is that

Code:

sysuse auto, clear
probit foreign c.mpg##c.mpg headroom

// SPLIT THE MPG VARIABLE INTO 10 BINS OF ABOUT EQUAL CARDINALITY
// (nq(10) STORES THE 9 INTERIOR CUTOFFS IN OBSERVATIONS 1 THROUGH 9)
pctile cutoffs = mpg, nq(10)

// ASSIGN EACH OBSERVATION TO THE APPROPRIATE BIN #
xtile mpg_group = mpg, nq(10)

// IDENTIFY THE MIDPOINT OF MPG IN EACH BIN
summ mpg, meanonly
gen midpoints = 0.5*(cutoffs[1] + `r(min)') in 1
replace cutoffs = `r(max)' in 10
forvalues i = 2/10 {
    replace midpoints = 0.5*(cutoffs[`i'-1] + cutoffs[`i']) in `i'
}

// ASSIGN TO EACH OBSERVATION THE MPG-MIDPOINT OF ITS BIN
// AND CALCULATE THE PROBABILITY OF OUTCOME IN EACH BIN
gen mpg_midpoint = midpoints[mpg_group]
by mpg_midpoint, sort: egen bin_probability = mean(foreign)

// CALCULATE PREDICTIVE MARGINS AT EACH BIN MIDPOINT
levelsof midpoints, local(midpoints)
margins, at(mpg = (`midpoints'))

// PLOT THE MARGINS AND OVERLAY A SCATTERPLOT
marginsplot, addplot(scatter bin_probability mpg_midpoint)

It's not a particularly pretty graph, and these particular data do not have a convincing quadratic relationship, but if you want you can beautify it a bit with various -twoway- options, nearly all of which are accepted by -marginsplot-. And you don't necessarily need to show this graph in your presentations anyway: the purpose is for you to visually observe the data and fitted curve to verify that the curve looks like a reasonable representation of the data and that you do see an increasing trend on the left, a peak somewhere in between, and then a decrease on the right. (The example graph shown above does not actually have a U-shaped relationship, so it won't look like that. It's just curvilinear. The point of that code is to just illustrate the approach. You will need to adapt it to your actual data.)
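The check on whether the turning point falls inside the observed range of the explanatory variable can be sketched as follows. This is only an illustration on the same auto.dta example (foreign as outcome, mpg as the explanatory variable); the scalar name vertex is a hypothetical choice, and the variable names need to be replaced with your own.

```stata
sysuse auto, clear
quietly probit foreign c.mpg##c.mpg headroom

// Turning point of the quadratic in the linear predictor,
// with a delta-method standard error and confidence interval
nlcom (vertex: -_b[mpg]/(2*_b[c.mpg#c.mpg]))

// Store the point estimate and compare it with the observed range of mpg
scalar vertex = -_b[mpg]/(2*_b[c.mpg#c.mpg])
quietly summarize mpg, meanonly
display "vertex = " vertex ", observed range = [" r(min) ", " r(max) "]"
display cond(inrange(vertex, r(min), r(max)), "vertex inside the data range", "vertex outside the data range")
```

If the vertex, or a large part of its confidence interval, lies outside the observed range, the fitted curve is only rising or only falling over the data, and a U or inverted-U description is not supported.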

Last edited by Clyde Schechter; 14 Apr 2022, 18:49.
Marry Lee

Join Date: Nov 2020

Posts: 189
#8

15 Apr 2022, 01:26

Thank you so much Clyde Schechter !! This is greatly helpful.