Interpretation of marginal effects of a continuous variable

Hakan Gunduz

Join Date: Dec 2018

Posts: 49
#1

01 Jan 2019, 15:28

Hi all,

I’ve been reading about this issue and I read pretty much all related topics but I still couldn’t understand the way we interpret the average marginal effect of a continuous variable.

If my model is

Code:

xtprobit y1 x1 x2

where x1 is a continuous independent variable.

I used the following to get the marginal effects:

Code:

margins, dydx(*)

What is the correct interpretation of the marginal effect if it comes out as -0.09 for x1? Is the following correct, assuming that x1 varies between -0.5 and 0.5:

“The average marginal effect on probability y=1(dichotomous dependent variable) associated with a 1 percentage increase in x1(continuous independent variable) is a 9 percentage point decrease.”

Many thanks

Last edited by Hakan Gunduz; 01 Jan 2019, 15:33.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

01 Jan 2019, 16:34

“The average marginal effect on probability y=1(dichotomous dependent variable) associated with a 1 percentage increase in x1(continuous independent variable) is a 9 percentage point decrease.”

No, that's not correct. It is wrong in two ways. The most important way in which it is wrong is in referring to a 1 percentage increase in x1. The (closer to) correct statement is that on average, the difference in the probability of y = 1 associated with a 1 unit (not percentage) increase in x is a decrease of 9 (not decrease of -9, which would be the same as an increase of 9) percentage points.

That one is close to true. Even so, it can be improved. The actual marginal effect is, as suggested by Stata's notation with the -dydx()- option, a derivative. It is analogous to an instantaneous velocity. We often speak loosely of going 30 km in the next hour when we are currently moving at a speed of 30 km/h. But that is not strictly true: if our speed changes as we move, then the total distance we travel will not necessarily be 30 km. A more accurate statement would be that if we continue moving at the same rate we will cover 30 km in 1 h. In a probit model, the rate of change in the probability of y = 1 per unit of x itself changes as x changes: it is analogous to movement at an ever-changing velocity. So it is not really true that a full unit increase in x will be associated with a 9 percentage point decrease. The more correct statement is that the rate of change in the probability of y = 1, averaged over all observations in the data sample, is -9 percentage points per unit difference in x.
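In symbols, the distinction can be sketched as follows. To keep the notation short, this assumes a plain probit with linear index x_i'b (ignoring the random effect in -xtprobit-), with b_1 the coefficient on x1; the symbols are introduced here for illustration only. The first line is the average of the derivatives, which is what -margins, dydx()- reports; the second is the average change in probability for an actual one-unit increase.

```latex
\text{AME}_{x_1}
  = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial x_{1i}}
  = \frac{1}{N}\sum_{i=1}^{N} b_1\,\phi(x_i'b)

\Delta_{x_1}
  = \frac{1}{N}\sum_{i=1}^{N}\left[\Phi(x_i'b + b_1) - \Phi(x_i'b)\right]
```

Here \Phi and \phi are the standard normal distribution and density functions. The two quantities agree only to the extent that \phi(x_i'b) stays roughly constant over the one-unit step, which is the "constant speed" condition in the velocity analogy.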

When you think about this carefully, it becomes apparent that the average marginal effect in a probit (or any other non-linear) model is a statistic of limited usefulness that must be interpreted with great caution. It is an instantaneous rate of change, and it does not even actually apply to the full range of observations in the data--indeed, it might apply only to a few rather exceptional ones. It is the result of trying to take a complicated, multifaceted relationship between y and x and reduce it to a single number. A great deal of information, perhaps the most important information, necessarily gets lost in the process.

When I deal with non-linear models involving a continuous variable, I generally do not try to describe them with average marginal effects. I prefer to pick an interesting, important, or representative range of values of the predictor variable(s) and calculate the marginal effects at those values. Only if those turn out not to differ very much from each other do I feel comfortable describing the process by an average marginal effect. (And most of the time that does not happen.) So I prefer to present graphs or tables of marginal effects at the various values of the predictor(s).
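For anyone who wants to try this, here is a minimal sketch of that approach. It runs on simulated data, so the panel structure, the helper variables (id, t, u) and every parameter value are assumptions made up for illustration, not anything taken from the model in #1. It also assumes a Stata version in which -margins- after -xtprobit- reports probabilities by default, as the output elsewhere in this thread suggests. The values in -at()- are placeholders to be replaced with ones that are meaningful for your own x1.

```stata
* Hypothetical simulated panel: all names and parameter values are assumptions
clear
set seed 2019
set obs 300
generate id = _n
generate u = rnormal(0, 0.5)        // panel-level random effect
expand 5                            // 5 observations per panel, u is copied
bysort id: generate t = _n
xtset id t
generate x1 = runiform() - 0.5      // continuous, between -0.5 and 0.5
generate x2 = rnormal()
generate y1 = (-1.2 - 0.5*x1 + 0.3*x2 + u + rnormal()) > 0

* Same model form as in #1
xtprobit y1 x1 x2

* Average marginal effect of x1: a single number
margins, dydx(x1)

* Marginal effect of x1 evaluated at several values of x1
margins, dydx(x1) at(x1=(-0.5(0.25)0.5))

* Graph the marginal effects from the previous -margins- command
marginsplot
```

If the marginal effects in the second -margins- table are close to one another, the single average from the first is a fair summary; if they are not, the table or graph is the better thing to report.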
Hakan Gunduz

Join Date: Dec 2018

Posts: 49
#3

01 Jan 2019, 16:55

Dear Clyde. Thank you very much for this amazing explanation. Initially, I did what you explained in the last paragraph- selecting different points and reporting marginal effects- as it provides an opportunity to compare different levels. However, I realized that two papers I’m following had reported marginal effects for the same continuous variable I have and their explanation is quite insufficient. Besides, my supervisor was interested in seeing a direct explanation on the marginal effects but I can see that there’s no single explanation covering the whole situation. Thank you very much again.

Edit: just a quick question: I’ve always seen the “unit” term in the discussions. Is there a way to say what the actual unit is? For example, in my test the variable varies between -0.5 and 0.5. The unit could be 0.1 or it could be 0.01 etc. How could we know the properties of the actual unit?

Last edited by Hakan Gunduz; 01 Jan 2019, 17:03.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

01 Jan 2019, 17:53

The unit means a change of 1 in the numerical value of x. So if your x variable ranges from -0.5 to 0.5, a unit change in x covers the entire range of that variable.
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#5

02 Jan 2019, 01:36

Not that I disagree with Clyde, but in this situation I would simply say:

"A unit increase in the value of x, on average leads to a decrease of 0.09 in the probability of the dependent variable to take the value of 1. "

The only minor difference between how I say it, and how Clyde says it, is that I avoid the "percentage points" terminology.

And interpretations always come from the fact that linear regression estimates

Ey = a + b*x

and when y is binary, Ey=Prob(y=1).
Hakan Gunduz

Join Date: Dec 2018

Posts: 49
#6

02 Jan 2019, 01:56

Dear Clyde and Joro, thank you for your comments. They have helped me understand the correct way to interpret the marginal effects. Just one more thing: if I transform my independent variable into a percentage, would I be able to say:

"A 1 percent increase in the value of x, on average, leads to a decrease of 9 percent in the probability that the dependent variable takes the value 1"

The unit in this situation equals 1%. Is this how we comment on percentage variables?
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#7

02 Jan 2019, 02:32

Hakan, there appears to be some very deep semantic and philosophical discussion as to what constitutes a percent, and what constitutes a percentage point. Maybe Clyde can weigh in here with more expertise.

To me, 1 percent = 1% = 0.01. In words, the number on the right-hand side is 0.01, which I call the "decimal representation", and between humans this decimal representation is communicated as 1% (a language convenience, so that we do not have to say "zero point zero one" but rather "1 percent").

Therefore in your case I would simply say "1 percent increase in the value of x, on average leads to a decrease of 9 percent in the probability of the dependent variable to take the value of 1"

which in my mind is equivalent to saying

" An increase in x of 0.01 leads on average to a decrease of 0.09 in the probability of y to take the value of 1."
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#8

02 Jan 2019, 02:38

I would say #7 if your independent variable was already in percent and you got the results that you reported in #1.

If you now transform your independent variable, note that your marginal effect will change as well.
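To make that last point concrete, here is a sketch under the assumption that x1 is a proportion and the transformed variable is z = 100*x1 (z is a name introduced here for illustration). Rescaling a regressor does not change the fitted probabilities, so by the chain rule:

```latex
\frac{\partial \Pr(y = 1)}{\partial z}
  = \frac{\partial \Pr(y = 1)}{\partial x_1}\cdot\frac{\partial x_1}{\partial z}
  = \frac{1}{100}\,\frac{\partial \Pr(y = 1)}{\partial x_1},
\qquad
\text{AME}_z = \frac{\text{AME}_{x_1}}{100} \approx \frac{-0.09}{100} = -0.0009
```

So with the estimate from #1, a one-point increase in z would go with a decrease of about 0.0009 in the probability, that is, roughly 0.09 percentage points rather than 9.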

Hakan Gunduz

Join Date: Dec 2018

Posts: 49
#9

02 Jan 2019, 02:58

Dear Joro, you are right. I have now solved the issue and for those who will see this post in the future I'm leaving the following:

I used

Code:

margins, dydx(*)

to get the average marginal effects for my independent variables. Here is what Stata gave me back:

Code:

---------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
x1               |  -.0920784    .046267    -1.99   0.047      -.18276   -.0013968

Then I tried to see the differences between different points:

Code:

margins, at(x1=(0.1 0.2))

Code:

1._at        : Leverage_D~t    =          .1

2._at        : Leverage_D~t    =          .2

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .1014253   .0095029    10.67   0.000     .0827999    .1200506
          2  |    .093177   .0114892     8.11   0.000     .0706586    .1156954
------------------------------------------------------------------------------

Code:

. display .1014253 - .093177
.0082483

and

Code:

margins, at(x1=(-.5 .5))

Code:

1._at        : Leverage_D~t    =         -.5

2._at        : Leverage_D~t    =          .5

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .1636319   .0316992     5.16   0.000     .1015026    .2257612
          2  |   .0716736   .0176303     4.07   0.000     .0371189    .1062284

Code:

. display .1636319 - .0716736
.0919583

We can see that an increase in x1 of one full unit (from -0.5 to 0.5) is associated, on average, with a decrease of about 0.09 in the probability that y takes the value 1.
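A quick way to put the two comparisons above on the same per-unit scale, using only the predictive margins already reported in this post (no new estimates are involved):

```stata
* Change in Pr(y = 1) per unit of x1 between x1 = 0.1 and x1 = 0.2
display (.093177 - .1014253) / (.2 - .1)

* Change in Pr(y = 1) per unit of x1 across the full range, x1 = -0.5 to 0.5
display (.0716736 - .1636319) / (.5 - (-.5))
```

The first comes out at roughly -0.082 per unit and the second at roughly -0.092 per unit, so the slope between 0.1 and 0.2 is somewhat flatter than the full-range average. The close match between the full-range figure and the average marginal effect is a feature of these data, not something a probit model guarantees.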

Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#10

02 Jan 2019, 09:18

Let me clarify the usage of percent and percentage point.

A) An x% increase in y means a change from y_0 to y_0 + (x/100)*y_0, y_0 referring to the baseline value of y.

B) An x percentage point increase in y is meaningful only if the unit of measurement of y itself is percent, and in that case it means a change from y_0 to y_0 + x.

Thus if y is itself in units of percent, and starts out at y_0 = 50%, A) a 10% increase in y means a change from 50% to 55% because 5 is 10 percent of 50. On the other hand, B) a 10 percentage point increase in y means a change from 50% to 60% because 50 + 10 = 60.

In a better world, we would all do as Joro Kolev suggests in #5 and simply speak of probabilities as numbers between 0 and 1. But the reality is that many people find this uncomfortable and prefer to cast them as percentages, running from 0 to 100. Indeed, even I lapse into this terminology in my own speaking and writing, though I think poorly of it. It is just ingrained in our discourse. It is necessary to distinguish A) and B), and the terminological distinction between percent and percentage point serves that purpose. It is, in my opinion, very important to use these terms correctly as blurring the distinction between them opens up the possibility of very large misunderstandings.
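To see how large the misunderstanding can be with numbers like those in this thread, take a hypothetical baseline probability of 10% (an assumption for illustration, chosen because it is close to the predictive margins reported earlier in this thread) and apply a decrease of "9" under each reading:

```stata
* Hypothetical baseline probability, in percent (an assumption for illustration)
scalar p0 = 10

* B) a 9 percentage point decrease: from 10% down to 1%
display p0 - 9

* A) a 9 percent decrease: from 10% down to about 9.1%
display p0 - (9/100)*p0
```

One reading leaves almost nothing of the baseline probability; the other barely moves it.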