Log-modulus (neglog) transformation

eligius belmont

Join Date: Nov 2024

Posts: 7
#1

19 Nov 2024, 14:39

Dear members,

I have a question about interpreting fixed-effects OLS (FEOLS) estimates for a log-modulus (neglog) transformed dependent variable.

For my thesis, I use a large panel data set, and my general estimation strategy follows the usual FEOLS setup:
Y_ijt = X_ijt * beta + a_ij + b_t + e_ijt
where Y is my dependent variable, X is a vector of independent variables with coefficient vector beta, a_ij are the fixed effects for ij (basically a territory dyad), b_t are the year fixed effects, and e is the error term.

Since my dependent variable has many extreme values both below and above 0, and because I would ideally like to interpret percentage (or at least relative) changes, I have transformed it with a log-modulus (also called neglog) transformation:
Y^b = sign(Y) * ln(|Y|)
There are no 0s in Y, and most values are far above 10000 in absolute value. I have attached a histogram of Y^b.
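
For concreteness, here is a minimal Stata sketch of this transformation on a few made-up values. The names y, y_b and y_b1 are illustrative assumptions, not variables from the actual data set. The log-modulus is often written with a +1 inside the logarithm, sign(Y) * ln(|Y| + 1); the second variable shows that variant for comparison, and with |Y| this large the two barely differ.

```stata
clear
input double y
-2500000
-40000
15000
800000
end

* Variant used in this post: defined only when y != 0
gen double y_b  = sign(y) * ln(abs(y))

* Common "+1" variant, which is also defined at y == 0
gen double y_b1 = sign(y) * ln(abs(y) + 1)

list y y_b y_b1, noobs
```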

I have learned that with a ln-transformed independent variable and a ln-transformed dependent variable, I can interpret the estimate as an approximate percentage change.
There is not much out there on log-modulus transformed variables in regression analyses, but as far as I understand, this interpretation is no longer possible because:

1. a negative transformed value could come from an original positive value (between 0 and 1) or from an original negative value, so it is ambiguous;

2. a change of polarity (from below 0 to above 0) cannot be interpreted as a percentage change (and I believe even regular percentage changes are difficult to interpret when values switch from negative to positive).

However, since I don't have values between -1 and 1, I guess I don't have to worry about the first point.
Regarding the second point: If I were to select only dyads (ij groups) from my data set (while being aware of selection bias here) which are polarity-fixed (meaning not changing signs throughout time), could I then interpret an estimate of a ln-transformed X on my Y^b again as approximate percentage change? Because what I would be left with in Y^b is just kind of a mean of both ln-transformed values and ln-transformed values times -1 (adequately representing an oppositely directed effect)?
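
To make the question precise, here is a sketch under two assumptions: each dyad keeps a constant sign s_ij in {-1, +1}, and a single regressor ln X enters with coefficient beta (notation assumed here for illustration).

```latex
\begin{align*}
Y^{b}_{ijt} &= s_{ij}\,\ln|Y_{ijt}| = \beta \ln X_{ijt} + a_{ij} + b_t + e_{ijt} \\
\Rightarrow\quad
\frac{\partial \ln|Y_{ijt}|}{\partial \ln X_{ijt}} &= s_{ij}\,\beta ,
\qquad
\frac{\partial Y_{ijt}}{\partial \ln X_{ijt}} = \beta\,|Y_{ijt}| .
\end{align*}
```

In this setup, beta would read as an approximate percentage change in |Y| whose direction flips with the dyad's sign. For beta > 0, a 1% increase in X goes with positive values growing by about beta percent and negative values shrinking towards zero by about beta percent.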

I wondered if year fixed effects would create difficulties (since they include both polarities) but then I am thinking I could build the same model by including year dummies, and clearly the estimate would not be bothered by it and just reflect the value under the condition of netting out year effects?
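
The equivalence mentioned here can be checked on simulated data. The sketch below assumes a balanced toy panel with made-up variable names (dyad, year, x, y) and an arbitrary data-generating process. It compares the within estimator with year dummies to plain OLS with dyad and year dummies; the coefficient on x should coincide.

```stata
clear
set seed 12345
set obs 50
gen dyad = _n
gen a = rnormal()                    // dyad-specific effect
expand 10                            // 10 years per dyad
bysort dyad: gen year = 2000 + _n
gen x = rnormal() + 0.5*a            // regressor correlated with the dyad effect
gen y = 1 + 2*x + a + 0.1*(year - 2000) + rnormal()

* Within (fixed-effects) estimator with year dummies
xtset dyad year
xtreg y x i.year, fe
scalar b_fe = _b[x]

* Same model written out with explicit dyad and year dummies
regress y x i.dyad i.year
scalar b_lsdv = _b[x]

display "FE: " b_fe "   dummies: " b_lsdv
```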

Can anyone help me out, if there is a substantial misinterpretation from my side or if you would agree with this line of arguing? Can I interpret the estimate as approximate percentage change for the case described?
I feel at least slightly adrift with this issue...

Would appreciate any help I can get! (:
Attached Files

Last edited by eligius belmont; 19 Nov 2024, 14:58.
Tags: fixed effects regression, log, loglinear, neglog, panel
Joao Santos Silva

Join Date: Apr 2014

Posts: 2982
#2

19 Nov 2024, 22:19

Dear eligius belmont,

It would help to know what is the variable that you are modelling, but my first reaction is that it is not a good idea to use this transformation. Besides the usual Jensen's inequality issue, it seems that you are trying to estimate a constant-elasticity model for a kind of data where that would not make sense.

Best wishes,

Joao
Nick Cox

Join Date: Mar 2014

Posts: 35349
#3

20 Nov 2024, 00:48

Handling outcomes that can be negative, zero or positive can be even trickier than handling outcomes that can be zero or positive. In the latter case it is obvious that a logarithmic transformation that might appeal on other grounds cannot cope with the zeros, and you need something else, whether

1. a different transformation, which divides opinion sharply

2. what is in generalized linear model jargon a logarithmic link, exploiting the fact that conditional means are expected to be positive

3. a two-part model, modelling first the occurrence of zeros or positive values as a binary outcome, and then the positive outcomes

4. something else that has escaped my attention.

I am more positive (pun intended) about this transformation than Joao Santos Silva -- or John Mullahy who often posts in this territory -- especially in the service of visualization where (quite literally) you can see if it is a bad idea OR if the variable concerned is a predictor (although again visualization is a vital check).

As I understand it neglog could be a link function for a generalized linear model, but I can't recall anyone coding it up properly.

Losing a simple interpretation of percent change seems a minor loss to me.
John Mullahy

Join Date: Dec 2016

Posts: 738
#4

20 Nov 2024, 05:41

The distribution depicted in the histogram is striking.

Do any of your X variables predict with certainty or with high probability whether a negative or positive outcome arises? E.g. female/male, urban/rural, etc.? If so and if the X's are plausibly exogenous then I would strongly consider stratifying on that X variable and conducting separate FEOLS analysis using the untransformed outcomes as the dependent variable. Of course only you will know whether such a stratified analysis is helpful for your research objectives.

As Joao Santos Silva noted in #2,

"It would help to know what is the variable that you are modelling"

to which I would add that it would be helpful to know what X variables you are using.
eligius belmont

Join Date: Nov 2024

Posts: 7
#5

20 Nov 2024, 09:01

Thanks very much for your comments and interest!

I thought it might not be as relevant since my questions are rather specific, but I understand that more context on my problem can be helpful.
My dependent variable measures unequal exchange, in the sense of unrecorded value transfers, and is calculated as:
Y = (ratio of exchange rate deviation indices) * exports - exports
My independent variables range from binary variables (like trade agreements, currency union etc.) to numerical percentage variables (like trade coreness) to ln-transformed numerical variables (like the ratio of wages). My main variable of interest is a count variable, namely IMF conditions, which I have transformed as ln(count + 1) for the main model (again to estimate relative rather than absolute changes).

Generally, I am aware that the neglog-transformation is not a straightforward way for estimation and I did try to seek other options.

@Joao Santos Silva Jensen's inequality issue also applies to the usual log-log models, right? But with my variables of interest being count data above 1 (well, and 0s) and Y being all >> 1 (or << -1), might it not be acceptable to live with it?
Since the theory I am relying on is empirically rather untouched, I was orienting myself on trade gravity models, where it seems a standard to work with log-log models (or at least in the past).
My model also works roughly with untransformed data but, in my mind, it would be much more meaningful to be able to interpret relative changes, especially since the range of the raw dependent variable is so large.

@Nick Cox While, in principle, Y could have 0s in my case, there are none. I have thought about other transformations and about adjusting the calculation of the dependent variable, but did not manage to find a satisfying version. However, I could look again into logarithmic links, thanks for this input. A two-part model that estimates polarity (the signs as binaries) and then the outcomes as all positives could be an option, although I wonder whether it would end up being more complicated than a neglog transformation (in terms of both empirical modelling and interpretation). In terms of visualization, what do you mean exactly? How would I know when it is a bad idea?

@John Mullahy The side of polarity depends on the ratio of exchange deviation indices. Since this is a crucial part of the calculation of the dependent variable, it is not exogenous to it. I am not aware of a good realistic predictor of the ratio being above or below 1 (which decides the polarity).

On a different note, my model also roughly works with untransformed variables. But, as stated before, it would be more meaningful to interpret relative changes. Furthermore, also for an IV-version of it, the neglog transformation works quite well for my purposes.

I guess I am just coming back to the question of whether I should be concerned about interpreting approximate percentage changes when Y is neglog-transformed, there are no values of Y between -1 and +1, and no trade dyads switch polarity?

Thank you very much for the help!
Nick Cox

Join Date: Mar 2014

Posts: 35349
#6

20 Nov 2024, 09:34

Some of my comments were intended generally and not (well) directed at your particular circumstances. As you have so many negative values, a log link couldn't be expected to work well.

As for visualization, I mean something like this.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(y1 y2)
0 0
1 1000
10 10000
100 100000
1000 1000000
end

gen log1p_y1 = log1p(y1)
gen log1p_y2 = log1p(y2)
set scheme stcolor
scatter log1p_y1 y1, name(G1)
scatter log1p_y2 y2, name(G2)
graph combine G1 G2

On the left, log1p (log(y + 1)) works quite well. On the right it does not work well at all: although the right tail has been pulled in, the lowest value of 0 is pushed out and is now a massive outlier.

You're contemplating a different transformation, but the principle is the same: plot transformed values versus original values to see what a transformation does.

It's not a deep point as whether a transformation is a good idea should leap out at you if it is.
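
Applied to the transformation under discussion, the same check might look like the sketch below. The values are made up (not the poster's data), and the names y, neglog_y and G3 are assumptions.

```stata
clear
set obs 201
* Toy values from -1e6 to 1e6 in steps of 1e4; 0 is dropped because ln(|y|) is undefined there
gen double y = (_n - 101) * 1e4
drop if y == 0
gen double neglog_y = sign(y) * ln(abs(y))

* Transformed versus original: look at how the tails are compressed
* and at the jump between the negative and positive branches
scatter neglog_y y, name(G3, replace)
```

With no values of |y| below 1e4, the transformed values leave a gap between roughly -9.2 and 9.2, so the two signs show up as two separate clusters.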

Percentage change is a natural measure if change (variation) is essentially multiplicative, but not otherwise. There is not much scope for discussion there beyond the weasel word essentially. I can't see that it's relevant to your kind of data and what's not relevant is no loss.

Last edited by Nick Cox; 20 Nov 2024, 09:36.
eligius belmont

Join Date: Nov 2024

Posts: 7
#7

20 Nov 2024, 10:42

Thanks for the example!

I am attaching the transformation of my dependent variable (Ue_l = Y^b (neglog-transformed), Ue = Y (untransformed)). The plot on the right is constrained to values of Y between -1e05 & 1e05 to zoom in.

If I understand correctly, I can argue that my transformation works well: outliers are strongly compressed, making regression analysis more robust, and there is quite some symmetry on both poles. Thus, if I assume a rather multiplicative relationship and no crossing of polarity within trade dyads, would you agree that I am good to go with that transformation?

Last edited by eligius belmont; 20 Nov 2024, 10:44.
Joao Santos Silva

Join Date: Apr 2014

Posts: 2982
#8

20 Nov 2024, 22:30

Dear eligius belmont,

With respect to #5, Jensen's inequality applies to any non-linear transformation: you estimate the expected value of the transformed data and that may not be informative about the expected value of the original data; with trade data, it rarely is.

As Nick Cox noted, a percentage change is not always a natural thing to look at. It is used a lot in economics because we often have in mind multiplicative models, such as the Cobb-Douglas production function or the gravity equation, that have constant (semi) elasticities; in models with different functional forms, these quantities depend on the regressors and therefore are less useful.

On a different topic, if you are worried about outliers, I suggest you use quantile regression rather than transforming the data.
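
As a rough illustration of this suggestion, the sketch below compares mean and median regression on simulated data with heavy-tailed errors. Everything in it (variable names, coefficients, the Student t errors) is an assumption made for the example. Note that qreg as used here does not absorb dyad fixed effects, so it leaves the panel structure aside.

```stata
clear
set seed 2024
set obs 5000
gen x = rnormal()
* Heavy-tailed errors (Student t with 2 df) produce extreme outcomes
gen y = 1 + 2*x + rt(2)

regress y x          // mean regression, sensitive to the extreme values
scalar b_ols = _b[x]
qreg y x             // median regression (quantile 0.5 by default)
scalar b_med = _b[x]
display "OLS slope: " b_ols "   median-regression slope: " b_med
```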

Best wishes,

Joao
Nick Cox

Join Date: Mar 2014

Posts: 35349
#9

21 Nov 2024, 03:48

The image in #7 isn't readable (by me). As explained in FAQ Advice #12 the forum software takes .png files but tends to clam up if fed any other graphics file format. It doesn't understand Stata .gph files for example. (The forum software was not written by StataCorp.)
eligius belmont

Join Date: Nov 2024

Posts: 7
#10

21 Nov 2024, 06:17

@Nick Cox sorry about the inconvenience! I am attaching the .png file this time, hoping you can see it.
Attached Files
Nick Cox

Join Date: Mar 2014

Posts: 35349
#11

21 Nov 2024, 06:39

Thanks for the graph. My own take is that if you think you need a transformation, this one works as well as could be expected.

The transformation preserves sign. It stretches most (is steepest) around zero and squeezes most (is gentlest) for the largest absolute values. Cube root and asinh (which economists often call IHS, for inverse hyperbolic sine, which is a weird abbreviation for anyone familiar with its use in Christianity) have qualitatively similar behaviour. As is documented, with cube root, you need to arrange the behaviour you want with

Code:

sign(y) * (abs(y)^(1/3))
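
To see the three transformations side by side, here is a small sketch on made-up values (the variable names are assumptions). For large |y|, asinh(y) is close to sign(y) * (ln|y| + ln 2), so on each side of zero it differs from the neglog version used in #1 by roughly a constant.

```stata
clear
input double y
-1000000
-10000
-100
100
10000
1000000
end

gen double neglog_y = sign(y) * ln(abs(y))      // version used in #1, no +1
gen double cuberoot = sign(y) * (abs(y)^(1/3))  // signed cube root, as above
gen double ihs      = asinh(y)                  // handles negative values directly

list, noobs
```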

I don't think there is a best buy for your data that everyone with experience would sign off on. Depending on what the project is, how much time you have, and so forth, trying different analyses might be a good idea. I like Joao Santos Silva's idea of quantile regression too.
eligius belmont

Join Date: Nov 2024

Posts: 7
#12

22 Nov 2024, 07:37

@Nick Cox thanks for your take and the comparison to IHS (yes, I guess the ambiguity of abbreviations can be entertaining, indeed :D), I will look into this transformation.

As to quantile regression, I too think it could be an appropriate and valuable option for my case, although I am a bit unsure whether it would be much more beneficial than running two regressions for dyads on either side of 0. I will look into it!

@Joao Santos Silva Thank you for your comment (#8).
First of all, I want to apologize for my comment in #5

where it seems a standard to work with log-log models (or at least in the past)

if it came across as an affront in any way. I guess I was aware of at least the existence of the debate on log transformations in gravity models, but did not recognize that it was actually your contribution in the first place! In hindsight, it feels slightly absurd to have written this to you.

I revisited the respective papers. I think I have roughly understood the problem; at least in my mind there now seem to be two major flaws in log-log (or any non-linearly transformed) least squares approaches:

1. Since Jensen's inequality implies E(ln y) != ln E(y), (percentage) interpretations of logged variables are inconsistent with the actual relation between the original variables.

2. OLS estimates the average relationship, but for each observation there is an implicit error term relative to the estimated mean. This error term becomes dependent on the log transformation because it relates to the other logged individual observations. This dependence may introduce heteroskedasticity as well as (therefore) a dependence on other logged independent variables. Hence, there is an implicit amplification of inconsistency (due to Jensen's inequality) within the estimation process.

Would you agree with this intuitive understanding or am I missing a point somewhere?

It’s no exaggeration to say I feel disillusioned by log-linear least squares regression. Interesting not to have learned that (but that economics classes teach the diametrical opposite of economic research (and of the real world anyway) is nothing new, I guess).

I also cannot get my head around the fact that there are so many publications still relying on it (especially in sociology but also in economics). I mean, there seems to be no remedy.
From this perspective, are there any valid approaches for still relying on log-linear least squares analyses?
Stewart (2018) argues that in some instances, one may actually be interested in E(ln Y), and thus log-linearized equations may still be estimated by OLS. They give as an example the statistical earnings function of labor economics. But even if so, I struggle to understand: 1. how this interpretation is economically meaningful when coefficients cannot consistently inform about actual income changes (except perhaps for comparing the extent of different coefficients), and 2. how the inconsistency caused by error dependence no longer plays a role.

Last edited by eligius belmont; 22 Nov 2024, 07:40.
Joao Santos Silva

Join Date: Apr 2014

Posts: 2982
#13

22 Nov 2024, 12:08

Dear eligius belmont,

Thank you for this. You are very kind, but you have nothing to apologise for. Not only was what you said absolutely fine, but you also do not have to be aware of my publications. Also, the discussion about transforming the dependent variable is much older than my work on it; e.g., John Mullahy, Jeff Wooldridge, and Arthur Goldberger made very important early contributions.

Anyway, I am not sure if I fully understand your interpretation, but one way to see it is that the regression in logs estimates the parameters of the conditional geometric mean, and this can be very different from the usual arithmetic mean we have in mind. When the data are strictly positive, arguably both can be interesting, but if you have observations equal to zero (or negative) then the geometric mean is not interesting.

Another way to look into it is to consider the case of log normal data. If we think of a multiplicative model with a multiplicative log-normal error term with mean 1, then the linearized model will have an error term that has a normal distribution with a mean that depends on the variance. So, the regression in logs conflates the effects of the regressors on the mean with their effects on the variance, and therefore will lead to inconsistent estimates of the elasticities unless we have homoskedasticity.
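
A small simulation can illustrate this point. The sketch below is an assumed setup chosen for the example, not taken from any paper. The true model is E[y | x] = exp(1 + x) with a multiplicative log-normal error of mean 1 whose log-scale variance equals x. Under this design E[ln y | x] = 1 + 0.5*x, so OLS in logs should settle near a slope of 0.5 rather than 1. Poisson regression with robust standard errors is used here as one estimator of the exponential mean for comparison.

```stata
clear
set seed 2024
set obs 50000
gen x = 2*runiform()
gen s2 = x                                 // log-scale variance depends on x
gen eta = exp(rnormal(-s2/2, sqrt(s2)))    // log-normal error with E[eta | x] = 1
gen y = exp(1 + 1*x) * eta                 // true coefficient on x is 1
gen lny = ln(y)

* OLS in logs: slope converges to 1 - 0.5 = 0.5 under this design
regress lny x, vce(robust)
scalar b_ols = _b[x]

* Poisson regression of y on x (log link): slope should be close to 1
poisson y x, vce(robust)
scalar b_ppml = _b[x]

display "OLS in logs: " b_ols "   Poisson: " b_ppml
```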

Finally, it is fair to say that in many cases (e.g., wage regressions) the two are too similar for people to care about the difference. That is one of the reasons the log transformation continues to be so popular.

Best wishes,

Joao
eligius belmont

Join Date: Nov 2024

Posts: 7
#14

26 Nov 2024, 10:06

@Joao Santos Silva Thank you very much for the two further perspectives.

So, do I then understand correctly that, apart from the likely inconsistent estimation of a constant elasticity, a log-linear as well as a log-log regression may very well yield consistent estimates of the conditional geometric mean (even in the case of strong heteroskedasticity)?

Additionally, I have an interpretation question regarding the geometric mean:
Looking at a fixed-effects OLS estimation with fixed effects as dummy variables, I understand how one can interpret the coefficient of strictly positive logged variables as a meaningful geometric mean, conditional on all control variables (including the fixed-effect dummies). However, this conditional geometric mean (aka the estimate) could turn out to be negative. Yet, strictly speaking, a geometric mean cannot be negative. How should I think about this? Could you perhaps shine some light on my misunderstanding? Also, feel free to refer me to other resources where this may be discussed in detail - I am just an econometrics beginner, so I may very well be talking at cross purposes :/

I hope it is okay that I ask these follow-up questions and I very much appreciate any help on the topic.
Joao Santos Silva

Join Date: Apr 2014

Posts: 2982
#15

26 Nov 2024, 12:04

Dear eligius belmont,

My bad; I did not fully explain it. The regression where the dependent variable is in logs estimates the parameters of a geometric mean, but the geometric mean is the exponential of the fitted values. I discuss that in this obscure paper and also, briefly, here.
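
A minimal sketch of that distinction on simulated data (all names and numbers are assumptions for the example): the coefficients and fitted values of the regression in logs can be negative, while the implied geometric mean, the exponential of the fitted values, is always positive.

```stata
clear
set seed 2024
set obs 1000
gen x = rnormal()
gen y = exp(-1 - 0.5*x + rnormal())   // strictly positive outcome
gen lny = ln(y)

* Constant-only regression in logs: exp(intercept) is the sample geometric mean
regress lny
scalar gm_reg = exp(_b[_cons])
quietly summarize lny
scalar gm_direct = exp(r(mean))
display "geometric mean: " gm_reg "   check: " gm_direct

* With a regressor: fitted values can be negative, their exponential cannot
regress lny x
predict double xb, xb
gen double gm_hat = exp(xb)
summarize xb gm_hat
```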

Best wishes,

Joao
