  • Growth rates as lag-scaled difference or as difference in logs?

    Dear all,

    I have what should be an easy question, yet I have not so far managed to solve it:

    I want to compute and use in regressions quarter-on-quarter or year-on-year growth rates, but I am unsure whether to do this with
    Code:
    gen growth = 100*D.volume/L.volume
    or with
    Code:
    gen ln = ln(volume)
    gen growth2 = 100*D.ln
    The latter yields a more sensible mean, but yields a bit over 5% of values < -100% which does not seem sensible to me economically.
    Which would you use, and if the latter would you winsorize or trim, and at exactly -100% or at P5?

    Thank you so much!
    PM

  • #2
    Code:
    g growth = ln(volume) - ln(l.volume)
    Is this a panel? If so, the lag has to be taken within each panel (bys id). If you have already xtset the data, the L. operator does that for you.
    Last edited by George Ford; 20 Mar 2024, 12:53.
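
    A minimal sketch of the panel point, on a made-up toy panel (id, quarter and all values are assumptions for illustration only). Once the data are xtset, L. and L4. only look back within the same id, so the first quarter of each panel gets a missing growth rate rather than a value borrowed from the previous panel.
    Code:
    clear
    input id quarter volume
    1 1 100
    1 2 110
    1 3 121
    1 4 741
    1 5 73
    2 1 50
    2 2 55
    2 3 60
    2 4 66
    2 5 70
    end
    * declare the panel structure so that time-series operators respect it
    xtset id quarter
    * quarter-on-quarter log growth, in percent
    generate growth_qoq = 100*(ln(volume) - ln(L.volume))
    * year-on-year log growth for quarterly data: lag by four quarters
    generate growth_yoy = 100*(ln(volume) - ln(L4.volume))
    list, sepby(id)
    The multiplication by 100 only puts the result on the same scale as the percentage change in #1.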


    • #3
      Great, that works (and is more efficient than my two-line code), thanks! I had indeed xtset the data.

      Still get growth rates < -100%, which I guess I have to deal with through winsorization, but other than that the growth rates do at least look sensible, thank you!


      • #4
        Did you look at the data that produced those changes to see if there's a problem?


        • #5
          Thank you, great idea!

          I have for example one case where volume = 73 and lagged volume = 741.

          Without using logs this looks to me like a fall by 668 and hence a growth rate of about -90%.
          But ln(73)-ln(741) suggests a growth rate of about -232%.
          Is this just because the log difference approximation is more off at such large growth rates, or did I make a mistake here?
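
          For reference, the arithmetic can be checked directly with display; this is only a sketch using the two numbers above. The first and third lines should both give about -90.1, the second about -231.8. The third line shows that a log difference g converts back to the exact percentage change via exp(g) - 1, so the log difference is unbounded below and a value under -100 is not by itself a sign of a data error.
          Code:
          * simple percentage change
          display 100*(73 - 741)/741
          * log difference, in percent
          display 100*(ln(73) - ln(741))
          * log difference converted back to an exact percentage change
          display 100*(exp(ln(73) - ln(741)) - 1)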

          Thanks so much!


          • #6
            The non-linearity of the log function means the log differences will get much larger (relative to the simple % change) as the changes get bigger.

            is 73 way out of range for that cross section? may be an error. are such big changes common/possible for the outcome?

            growth rates can be computed in a variety of ways. you could try the arc formula if you want to reduce extremes, just be sure to say what you did. Those big changes may act as outliers and distort your regression line.


            • #7
              Yes, a reduction of the volume from a level of 741 to a level of 73 is totally plausible here.

              By ARC, I assume you mean the Average Rate of Compound Growth, i.e. (volume/L.volume-1)?
              That is in fact a very useful suggestion for another time when I'm interested in growth over multiple periods.
              In my current case, where I'm interested in growth since the last quarter only, that would yield the same result as percentage growth, (volume-L.volume)/L.volume.
              That measure, in contrast to the log difference approximation, is bounded below at -100%, but at the price of larger outliers at the top and hence an inflated mean.

              Regression results in any case seem not too different whether I use percentage changes or the log difference, as long as I winsorize both variables at P5 and P95,
              so if there is no better third alternative, I will probably just use the log difference approximation.
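
              A minimal sketch of winsorizing at P5 and P95 with built-in commands only, assuming growth2 from #1 is already in memory; growth2_w, lo and hi are names made up here. Note that the percentiles are pooled over all panels and quarters, which is an assumption about what is wanted.
              Code:
              * store the pooled 5th and 95th percentiles
              quietly summarize growth2, detail
              local lo = r(p5)
              local hi = r(p95)
              * copy the variable, then pull the tails in to the cutoffs
              generate growth2_w = growth2
              replace growth2_w = `lo' if growth2 < `lo'
              * missing values count as larger than any number, so exclude them
              replace growth2_w = `hi' if growth2 > `hi' & !missing(growth2)
              Trimming instead of winsorizing would set those observations to missing rather than to the cutoff values.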


              • #8
                Sorry, as the presumed general formula for ARC I of course meant
                (volume/L.volume)^(1/n) - 1
                where n is the number of periods, which is simply 1 in my case.


                • #9
                  (y1 - y0) / (0.5 * (y1 + y0))
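
                  In Stata, with the data xtset, this arc (midpoint) formula could be sketched as below; growth3 is a made-up name. As long as volume is strictly positive, the result lies between -200% and +200%. For the fall from 741 to 73 discussed above it gives 100*(-668)/407, or about -164%.
                  Code:
                  * arc growth: change relative to the average of the two levels
                  generate growth3 = 100*(volume - L.volume)/(0.5*(volume + L.volume))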


                  • #10
                    Thank you!

                    That still generates a distribution of growth rates with lots of extreme values (up to -200%) left of P5 as well as (up to +200%) right of P95,
                    but for some reason using that instead of the log difference approximation or the percentage change makes many results slightly more precise.

                    So it's good to have learnt all three variations now, so I can report all three and describe the pros and cons in terms of the resulting distributions of each!


                    • #11
                      might try winsorizing to see what that does.
