Log transformation of negative profits. Solutions?

Kevin Pineda-Hernandez

Join Date: Apr 2020

Posts: 4
#1

25 Jul 2023, 15:06

Hi all,

I want to use the logarithm of a firm's profits as a dependent variable (see data example). However, I have many firms with negative values, which implies that I would get many missing values after the log transformation. I know that one potential solution is to add a constant to profits before taking logs. However, I have firms with huge negative profits, so I do not see that as feasible. Do you have other solutions, or am I obliged to pay the price of using a log transformation?

Thanks in advance for your help.

Best,
Kevin

Code:

firm year profits_heures
3 2002 -10.93949
3 2002 -10.93949
3 2002 -10.93949
3 2002 -10.93949
8 2006 .279665
8 2006 .279665
8 2006 .279665
8 2006 .279665
8 2007 .9127844
8 2007 .9127844
8 2007 .9127844
8 2007 .9127844
8 2007 .9127844
8 2008 -2.7366899
8 2008 -2.7366899
8 2008 -2.7366899
8 2008 -2.7366899
8 2008 -2.7366899
8 2008 -2.7366899
8 2010 -.54344109
8 2010 -.54344109
8 2010 -.54344109
8 2010 -.54344109
end
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

25 Jul 2023, 15:13

Why do you want to use a log transformation for this variable? What purpose do you hope to accomplish by doing that? Whatever the purpose, if it is legitimate at all, there is surely a better way to achieve it--it is hard to imagine a variable less suitable for log transformation. So say what that purpose is, and perhaps a suitable solution will be found.
Nick Cox

Join Date: Mar 2014

Posts: 35432
#3

26 Jul 2023, 02:35

Clyde Schechter asks a fair question. What to do here is much debated, on Statalist and more generally, and it sometimes seems that nobody much likes anybody else's favoured solutions. Here are some:

0. Leaving your outcome untransformed. Only trying it will show what virtues and vices this has.

1. If you think your mean function is positive, that is, the mean outcome as a function of predictors, which could be plausible if negative values -- even though sometimes very large negative -- are in a small minority, then generalized linear models with a logarithmic link might work adequately.

2. The so-called neglog transformation T(y) = sign(y) * log(1 + abs(y)) has the merits that

* it preserves sign, as T(y) is negative, zero, or positive exactly as y is negative, zero or positive

* it behaves like y for y near 0, like log y for y >> 0 and like -log(-y) for y << 0

* it is likely to reduce problems with appreciable skewness, tail weight or outliers.

However, it does not find universal favour: critics see it as arbitrary, while others regard it as "fit for purpose".

3. T(y) = asinh(y) or more generally asinh(k y) has some family resemblance to #2 and comments tend to be similar. (In some literature, it is known as IHS, an abbreviation likely to bemuse or puzzle those who know other uses for that abbreviation or contraction, but IHS is made intelligible as inverse hyperbolic sine.) Note that choice of k > 0 is crucial and defaulting to k = 1 is also a choice.

Further comments on #2 and #3. Although getting some desired shape for the marginal distribution of T(y), or that of y, is not at all first priority in choosing a method, it's not irrelevant either. I would always

* plot T(y) versus y for the outcome data to get a sense of whether it behaves sensibly

* look especially carefully at residuals from any model predicting T(y) from your X variables.

* invert predictions using the inverse function.

4. A two-part model predicting profit or loss as binary and magnitude of profit or loss as a non-negative outcome. No experience with this myself, but it's a well-used model in some fields. Even simpler is the possibility that profit and loss define subsets which deserve, or even demand, quite different models.

5. I've left until last log(y + c) where c is large enough to make every y + c positive, so that all logarithms are defined. I rate this easily the worst solution as ad hoc in the worst sense. A plot of log(y + c) versus y is again essential to see what the transform does, and it's often deeply unsatisfactory.
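As a rough illustration of #2, #3 and #5, here is a minimal sketch in Python (not Stata) using only the standard library. The function names and the choice c = 1 - min(y) are assumptions made for the example, not anything prescribed in this thread; the profit values are the distinct ones from the data example in #1.

```python
import math


def neglog(y):
    """neglog transform: sign(y) * log(1 + |y|)."""
    return math.copysign(math.log1p(abs(y)), y)


def neglog_inverse(t):
    """Inverse of neglog: sign(t) * (exp(|t|) - 1)."""
    return math.copysign(math.expm1(abs(t)), t)


def ihs(y, k=1.0):
    """Inverse hyperbolic sine of k * y; k > 0 is a scale choice."""
    return math.asinh(k * y)


def ihs_inverse(t, k=1.0):
    """Inverse of ihs: sinh(t) / k."""
    return math.sinh(t) / k


def shifted_log(y, c):
    """log(y + c); only defined when y + c > 0."""
    return math.log(y + c)


# Distinct profit values from the data example in #1.
profits = [-10.93949, 0.279665, 0.9127844, -2.7366899, -0.54344109]

# Hypothetical shift: just large enough that the smallest y + c equals 1.
c = 1 - min(profits)

print(f"{'y':>12} {'neglog':>9} {'asinh':>9} {'log(y+c)':>9}")
for y in profits:
    print(f"{y:12.5f} {neglog(y):9.4f} {ihs(y):9.4f} {shifted_log(y, c):9.4f}")
```

The first two transforms keep the sign of each profit and have explicit inverses for back-transforming predictions, while the values of log(y + c) depend entirely on the single most negative observation through c.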

Simple to state, but harder to satisfy, is the requirement that a good approach not only "works" with your data but also allows relating your results to those of other studies in a relevant literature. The chicken and egg problem is that you would often need to re-do other studies using a particular method to be able to compare. "Do what others have done" is mixed counsel, as the persistence of #5 as a suggestion implies to me that many researchers are not thinking hard enough about what they have done. In particular, plotting the transformation and plotting the results are often neglected steps, especially, it seems, in some branches of economics.
Nick Cox

Join Date: Mar 2014

Posts: 35432
#4

26 Jul 2023, 07:20

I should add -- before, say, John Mullahy does -- that the 1 in sign(y) * log(1 + abs(y)) is not an innocent neutral. If the units are, say, million USD, 1 means something quite different from what it means if the units are USD. And so on.
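To make the point about units concrete, here is a small Python sketch (standard library only). The profit figures are invented for illustration and the helper names are hypothetical. For a plain logarithm of positive values, changing units from USD to million USD shifts every value by the same constant, log(1e6). For neglog the shift differs from observation to observation, so the shape of the transformed variable changes with the units.

```python
import math


def neglog(y):
    """neglog transform: sign(y) * log(1 + |y|)."""
    return math.copysign(math.log1p(abs(y)), y)


def neglog_unit_shift(y_usd, factor=1e6):
    """Change in neglog when y is re-expressed in units of `factor` USD."""
    return neglog(y_usd) - neglog(y_usd / factor)


def log_unit_shift(y_usd, factor=1e6):
    """Same comparison for a plain log (positive y only)."""
    return math.log(y_usd) - math.log(y_usd / factor)


# Invented profits in USD, purely for illustration.
profits_usd = [-2_000_000.0, -500.0, 40.0, 3_000_000.0]

for y in profits_usd:
    print(f"{y:14.1f}  neglog(USD) = {neglog(y):9.4f}  "
          f"neglog(million USD) = {neglog(y / 1e6):9.4f}  "
          f"difference = {neglog_unit_shift(y):9.4f}")
```

In million USD the two small values sit in the near-linear region around zero, while in USD they are already well into the logarithmic region, which is why the differences are far from constant.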
John Mullahy

Join Date: Dec 2016

Posts: 742
#5

26 Jul 2023, 12:27

It seems Nick Cox knows where I stand on such matters.

(I should emphasize that my comment here is meant to be a generic one, not one focused specifically on the issue raised in #1 of the present thread.)

Last edited by John Mullahy; 26 Jul 2023, 12:34.
