  • Log transformation of a variable results in dropped observations

    After I transformed the values of a variable into (natural) logarithmic values, a large number of observations were dropped.

    Code:
    generate ln_excess_returns=ln(excess_returns)
    Why did this happen? What happens when the data is transformed into logarithmic values?
    I use Stata 14.1.
    Thank you very much.

  • #2
    What happens when the data is transformed into logarithmic values?
    Intentionally provocative (but no offense): you should know that if you decide to transform your data. At the very least, you should have a reason for doing so other than "everyone else does it".

    My guess is that you have zero and/or negative values in excess_returns. The logarithm is defined only for strictly positive values, so you end up with missing values - not with fewer observations! You are correct in stating that cases with missing values will be excluded from regression-type models, but you did not state that you were running those. I will assume you want something like

    Code:
    regress ln_excess_returns indepvars
    If so, please read Bill Gould's blog entry on the matter.
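
    If that guess is right, a quick check should confirm it (a sketch, reusing the variable names from your post; the two counts should agree):

    Code:
    count if excess_returns <= 0
    count if missing(ln_excess_returns) & !missing(excess_returns)
    The first counts nonpositive values of the original variable (missing values sort high in Stata, so they are not counted); the second counts the new missings created by ln().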

    Best
    Daniel



    • #3
      Daniel,
      thank you for the reply. The article was a great start. And yes, you are right, I don't quite know what I'm doing... but I'm learning.

      Yes, I want to run a multiple regression with "excess return" as the dependent variable. According to codebook, this variable has negative values even after I "log" it:

      range: [-4.1195941, 4.0271358]
      std. dev.: 1.44402
      There are about 3600 values in total...



      • #4
        Naturally. Any positive value less than 1 has a negative logarithm. That is expected, and not in itself a problem. Taking your lowest value, note that it corresponds to a return of about 0.016:

        Code:
        . di exp(-4.119594)
        .01625111
        I'd revise logarithms before you make use of them.
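
        For reference, what ln() does with small, zero, and negative arguments (a minimal sketch):

        Code:
        . di ln(.5)
        -.69314718
        . di ln(0)
        .
        . di ln(-1)
        .
        Values between 0 and 1 have negative logarithms; zero and negative arguments return missing.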

        The bigger deal is zero and negative values in the original. Some people use sign(x) * ln(1 + abs(x)) as a transformation that behaves like ln(x) for large positive x and like -ln(-x) for large negative x. That doesn't look friendly until you get to know it.
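
        A minimal sketch of that transformation in Stata (the new variable name is just for illustration):

        Code:
        generate neglog_excess_returns = sign(excess_returns) * ln(1 + abs(excess_returns))
        Unlike ln() alone, this is defined for zero and negative arguments, so the transformation itself creates no missing values.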
        Last edited by Nick Cox; 28 Apr 2016, 02:44.
