
  • Interpretation of log-log and log level coefficients

    Hi,

    I am currently in the process of writing my undergraduate dissertation and I am having issues in the interpretation of my coefficients.

    I am investigating the linkage between immigration and house prices in England. My dependent variable is the log of average house prices in each local authority identified and my independent variable is net migration as a proportion of the population of the local authority in that year.

    When I run the random effects regression as log-level, I get a significant coefficient of 4.991, which would seem to mean that a 1 unit increase in migration leads to a 499.1% increase in house prices? If I run it as log-log instead, I get a coefficient of 0.009 (roughly 1%), which would be a more plausible result, but my independent variable of migration is a proportion between -1 and 1, so I am unsure how to interpret this as I am effectively logging a percentage?

    Any help would be much appreciated.

    Many thanks,
    John


  • #2
    Forget about log-transforming your migration variable. You can't take logarithms of negative numbers. And, particularly when the natural range of the variable is between -1 and 1, the kludge of using log(1+x) seriously distorts things.
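
    To see the distortion concretely, here is a quick check you can run in Stata (the values are just illustrative, picked from the -1 to 1 range you describe):

    display log(1 + 0.9)
    * 0.64 -- a value of 0.9 gets compressed
    display log(1 - 0.9)
    * -2.30 -- a value of -0.9 gets stretched far out, and values at or below -1 are undefined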

    In the analysis of log housing price and untransformed immigration, there are several issues with what you are saying. First, you are using causal language, which is not appropriate for describing the results of observational studies. You should say instead that a unit difference in migration is associated with a difference of (some amount) in housing prices. Next, what is that amount? The rule that the coefficient translates directly into a percentage is an approximation, and one that only works well with small coefficients. Here's the right way to do it:

    If migration differs by 1 unit, then the expected value of log housing price differs by 4.991 units. In equations:
    log housing_price_1 = log housing_price_0 + 4.991.

    Exponentiating both sides of this equation:
    housing_price_1 = housing_price_0 * exp(4.991)

    Now, exp(4.991) is approximately 147. So housing_price_1 = 147*housing_price_0. This is a 14,600% increase. Frankly, although I know next to nothing about housing prices, this does not seem plausible to me, and I suggest you carefully review both your modeling and the correctness of your data set.
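
    If you want to check that arithmetic yourself, a couple of display commands will do it (the only number assumed here is your coefficient of 4.991):

    display exp(4.991)
    * 147.08 -- the multiplicative effect
    display 100*(exp(4.991) - 1)
    * 14608 -- roughly a 14,600% difference, not 499.1%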



    • #3
      Thank you for your reply!

      I understand there must be clear issues with my data and think I need to go back to the drawing board slightly.

      I did have one question relating to your interpretation of my coefficient. I wondered about the reasoning behind taking the exponential of both sides of the equation, as I thought multiplying Beta1 by 100 would yield a percentage that we could then use for interpretation.

      Many thanks,
      John



      • #4
        I wondered about the reasoning behind taking the exponential of both sides of the equations...
        Exponentiation is how you get from log-transformed variables back to untransformed ones. More formally, exp() is the inverse function of log(). And, of course, whatever you do to one side of an equation you must also do to the other.
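
        As a quick sanity check of that round trip (250000 is just an arbitrary example price):

        display exp(log(250000))
        * 250000 -- exp() undoes log(), which is why exponentiating both sides recovers the untransformed prices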

        ...as I thought multiplying the Beta1 by 100 would yield a percentage that we could then use to interpret.
        Multiplying beta1 by 100 gives a useful percentage when beta1 is a small number. But it is only an approximation to the correct difference, and it is only a good approximation when beta1 is small in magnitude (for practical purposes, less than about 0.1). The larger beta1 is, the worse the approximation becomes.

        You can see this if you look at the mathematics in a bit more depth--calculus required here. What we saw in #2 is that the coefficient beta1 corresponds to a multiplicative effect of exp(beta1). Now, the exponential function has a Taylor series that converges everywhere: exp(beta1) = 1 + beta1 + (beta1)^2/2! + (beta1)^3/3! + ... If beta1 is small, the terms in (beta1)^2 and higher powers rapidly become very small and can, to a reasonable approximation, be ignored. So, for small beta1, exp(beta1) is approximately equal to 1 + beta1, and that is why the effect is approximately a 100*beta1 percent increase. The closer to zero beta1 is, the better the approximation.

        But when beta1 is not small, the higher order terms are not negligible. And when, as in your case, it is bigger than 1, these terms start to blow up (for a while until the factorial in the denominator overwhelms them) and their contributions are not only not negligible, but they dominate the result.
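
        A quick numerical illustration of how the approximation degrades as beta1 grows (the first two values are arbitrary; the last is your coefficient):

        display 100*(exp(0.05) - 1)
        * 5.13 -- close to the 5% that 100*beta1 suggests
        display 100*(exp(0.5) - 1)
        * 64.87 -- already well away from 50%
        display 100*(exp(4.991) - 1)
        * 14608 -- nowhere near 499.1%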



        • #5
          I agree with Clyde--seems a bit large. Something ain't right.

          Also try this to make sure you are interpreting correctly.

          margins, at(immig= (-1(0.1)1.0)) expression(exp(predict(xb))*exp((`e(rmse)'^2)/2))

