Exponential Regression: nl vs. reg

Kevin Park

Join Date: Aug 2014

Posts: 4
#1

01 Jun 2015, 21:54

I would like to fit an exponential decay function: y=A*exp(b*x)

I thought the best method would be to use the "nl" command, as in:
nl (y={A}*(exp({b}*x)))

But an alternative method should be to take the log of y first and then run a simple linear regression.
Because
ln(y)=ln(A*exp(b*x))
ln(y)=ln(A)+ln(exp(b*x))
ln(y)=ln(A)+b*x

Therefore, using
gen ln_y=ln(y)
reg ln_y x

should get me the same result.

But I don't get the same estimates of "b" under both approaches.

I am not very familiar with the "nl" command. So maybe I am not using it correctly?

Thanks
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2066
#2

01 Jun 2015, 22:58

Nothing says you should get the same result. The log of the expectation is not the expectation of the log. You need to start with a model that includes an error term, say, a multiplicative error term in the exponential model. If that error is independent of x then the two methods are both consistent for b. But they won't be the same. Without independence they have different probability limits. You need to decide what you're willing to assume.
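A short worked step behind "the log of the expectation is not the expectation of the log". The symbol u for the multiplicative error is introduced here only for illustration:

```latex
\begin{aligned}
y &= A\,e^{bx}\,u, \qquad u > 0,\\
\ln E[y \mid x] &= \ln A + bx + \ln E[u \mid x],\\
E[\ln y \mid x] &= \ln A + bx + E[\ln u \mid x],\\
E[\ln u \mid x] &\le \ln E[u \mid x] \quad \text{(Jensen's inequality)}.
\end{aligned}
```

If u is independent of x, both E[u|x] and E[ln u|x] are constants, so the two approaches recover the same slope b and differ only in the intercept. If u depends on x, those terms vary with x and the two slopes generally differ.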
Clyde Schechter

Join Date: Apr 2014

Posts: 29581
#3

01 Jun 2015, 22:58

The two models are not equivalent because you have overlooked the role of error terms. The actual equation that -nl- estimates is

y = {A} * exp({b}*x) + e

And -nl- finds the values of A and b that minimize the sum of e^2.

Crucially, the error term is additive. So when you take logarithms, you don't get

ln_y = ln(A) + b*x + e. You get ln_y = ln(A*exp(b*x)+e), which does not simplify in closed form.

The model you are estimating using -reg ln_y x- would be equivalent to y = {A} * exp({b}*x) * u, with a multiplicative error u whose logarithm, ln(u), plays the role of the additive error in the log regression.
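The difference between the two estimators can be seen on simulated data. This is a minimal sketch, not part of the original post: the seed, sample size, true values (A = 5, b = -0.3), error distribution, and starting values are all assumptions chosen for illustration.

```stata
* Simulated data; every number below is an illustrative assumption
clear
set seed 2015
set obs 1000
generate x = 10*runiform()

* Multiplicative lognormal error, independent of x
generate y = 5*exp(-0.3*x)*exp(rnormal(0, 0.5))

* Nonlinear least squares: minimizes the sum of squared additive residuals
nl (y = {A=1}*exp({b=-0.1}*x))

* OLS on logs: the constant estimates ln(A) plus the mean of the log error
generate ln_y = ln(y)
regress ln_y x
```

Because the error here is independent of x, both slope estimates should be close to the assumed b, but they will not coincide in any finite sample. Making the error variance depend on x would be one way to see the estimates drift apart.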
Joao Santos Silva

Join Date: Apr 2014

Posts: 2940
#4

02 Jun 2015, 05:40

Dear Kevin,

There are few discussions in which I would have anything to add after Jeff and Clyde contributed, but I guess this is one of those very rare cases.

As Jeff pointed out, the two methods will generally lead to different estimates, which in general do not even have the same probability limit. So, you need to decide whether you are interested in learning about the effects of x on (the conditional mean of) y or on (the conditional mean of) ln(y).

In case you want to learn about the conditional expectation of y given x, you have to estimate the model in its multiplicative form. However, using -nl- may not be the best option because there is ample evidence that the non-linear least squares estimator can be very inefficient in this context. A much safer approach is to use Poisson regression with robust standard errors, as advocated here (see also here). Whether estimating a model in logs or in levels makes a material difference is very much an empirical question, but there are well-known cases where the difference can be substantial.
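A minimal sketch of the Poisson approach, assuming a nonnegative outcome y and a regressor x are already in memory (the variable names follow the first post). The outcome does not need to be a count for the estimator to be used this way.

```stata
* Poisson quasi-MLE of E[y|x] = A*exp(b*x), with robust standard errors
poisson y x, vce(robust)

* The coefficient on x estimates b; the exponentiated constant estimates A
display "A = " exp(_b[_cons]) "   b = " _b[x]
```

With a noninteger y, Stata may print a note about interpreting a noncount dependent variable; the robust standard errors are what make the inference valid without assuming a Poisson distribution.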

Finally, Clyde's discussion of the two models is not entirely correct. Indeed, what -nl- estimates is

y = {A}*exp({b}*x) + e

but this model can also be written as a model with a multiplicative error because

y = {A}*exp({b}*x) + e = {A}*exp({b}*x)*u

with u = 1 + e/({A}*exp({b}*x)).

So, your model in logs will be ln_y = ln(A) + b*x + ln(u). The problem is that in general the conditional expectation of ln(u) is not a constant, and therefore OLS is likely to be inconsistent for b. The paper I mentioned above contains a detailed discussion of this problem.

All the best,

Joao
Kevin Park

Join Date: Aug 2014

Posts: 4
#5

04 Jun 2015, 13:40

Thank you Jeff, Clyde and Joao.

I am working with population density gradients. My intuition had been to model the effect of x (distance) on the conditional mean of y (population density). But it seems the standard practice in my field is to model ln(y) and that I wrote the initial equation wrong.
This paper (page 16) writes the formula as equivalent to y={A}*exp({b}x+e), with the error inside the exponential term, so that
y={A}*exp({b}x+e)
ln(y)=ln({A}) + ln(exp({b}x)) + ln(exp(e))
ln(y)=ln({A}) + {b}x + e

This is just for the summary statistics part of my paper, so I would prefer not to delve into a mathematical debate on this issue with the existing literature, but do you see any problems with that formulation?

Thanks again.
Kevin
Joao Santos Silva

Join Date: Apr 2014

Posts: 2940
#6

04 Jun 2015, 15:28

Dear Kevin,

The problem with that formulation is that A and b are parameters of the conditional expectation of ln(y) given x, but in general are not parameters of the conditional expectation of y given x. Suppose e is heteroskedastic; then E[y|x] = {A}*exp({b}x) * E[exp(e)|x], which is not equal to {A}*exp({b}x) because E[exp(e)|x] is a function of x.
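One illustrative case, under assumptions that are not in the posts above: suppose e is conditionally normal with mean zero and a variance that is linear in x. The coefficients c and d are hypothetical, with c + dx nonnegative over the range of x.

```latex
\begin{aligned}
e \mid x &\sim N\!\left(0,\ \sigma^2(x)\right), \qquad \sigma^2(x) = c + d\,x,\\
E[\exp(e) \mid x] &= \exp\!\left(\tfrac{1}{2}\sigma^2(x)\right),\\
E[y \mid x] &= A \exp\!\left(bx + \tfrac{1}{2}\sigma^2(x)\right)
             = A\,e^{c/2} \exp\!\left(\left(b + \tfrac{d}{2}\right)x\right).
\end{aligned}
```

In this case the regression in logs recovers b, while the slope of the conditional mean of y is b + d/2. The two agree only when d = 0, that is, when the error variance does not depend on x.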

As Jeff pointed out above, you need to decide which of the two conditional expectations is of interest. In case you really care about E[y|x], then the model in logs may be misleading.

All the best,

Joao
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2066
#7

04 Jun 2015, 22:45

Kevin: My 1992 IER paper, "Some Alternatives to the Box-Cox Regression Model," also discusses what Joao has highlighted above: that a model of a conditional mean for a nonnegative response can be written with an additive or multiplicative error. The models are equivalent unless one imposes some extra assumption, such as that the error is independent of the covariates. I would prefer the exponential model estimated using the Poisson or gamma quasi-MLEs.
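Both quasi-MLEs can be fit with -glm- and a log link. This is a sketch under the assumption that y is strictly positive and that y and x are in memory; the variable names follow the first post.

```stata
* Poisson quasi-MLE of E[y|x] = exp(ln(A) + b*x)
glm y x, family(poisson) link(log) vce(robust)

* Gamma quasi-MLE of the same conditional mean
glm y x, family(gamma) link(log) vce(robust)
```

Both commands model the same conditional mean and differ only in how observations are weighted, so comparing the two slope estimates is a rough informal check on the mean specification.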
samwoka mutanda

Join Date: Aug 2016

Posts: 1
#8

29 Aug 2016, 12:10

Hi. I am new to Stata, so kindly forgive my naivety. I am trying to run an exponential fit and I can't figure out the code for it. Kindly help. Thanks.
Rich Goldstein

Join Date: Mar 2014

Posts: 4393
#9

29 Aug 2016, 12:13

If you look at the help for "nl" you will see several examples of "common" models, including the exponential; if this is not what you are looking for, please read the FAQ and then clarify your question.
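For instance, a sketch assuming variables named y and x are in memory; the starting values in the second command are placeholders and may need adjusting for convergence:

```stata
* Built-in two-parameter exponential: y = b1*b2^x
nl exp2: y x

* The same curve as a substitutable expression, where b corresponds to ln(b2)
nl (y = {A=1}*exp({b=0}*x))
```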
Clyde Schechter

Join Date: Apr 2014

Posts: 29581
#10

29 Aug 2016, 13:51

Another possibility is that what you are trying to do is fit an exponential time-to-failure model. In that case, see -help streg-.
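A minimal sketch of that route. The variable names time, died, and x are hypothetical: time is the follow-up time, died equals 1 for an observed failure and 0 for a censored observation, and x is a covariate.

```stata
* Declare the data to be survival-time data
stset time, failure(died)

* Exponential (constant-hazard) survival regression
streg x, distribution(exponential)
```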