  • When to generate the variable into (natural) logarithm?

    I just remembered my teacher saying that I need to transform the numerical variables (like NPL, NOW, LLR and LLP) into (natural) logarithms because there are also percentage (%) variables (like CH_NPL, CH_LOANS) in my data.

    But I'm not sure whether it should be the natural logarithm (ln) or just the logarithm (log).

    Also, I'm not sure which command to use in either case.

    Please help, it's really urgent. Thank you!
    [Attached image: attach.png]

    Last edited by Minh Minh; 25 May 2023, 02:59.

  • #2
    Minh:
    poorly structured questions get equivalent replies, unfortunately (please see the FAQ on how to post more effectively on these forums. Thanks).
    That said, -ln- is usually the way to go (bearing in mind that the -ln- transformation comes with issues concerning zero and negative values and back-transformation to the raw scale).
    I assume that, by mentioning percentages, you mean something like log-linear and/or log-log regressions.
    As an aside, please note that invoking urgency does not put your query on any fast lane (after all, all requests for help are urgent, otherwise the posters would not send them to Statalist).
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      The choice between natural logarithms, logs to base 10, and indeed logs to base 2 or any other base should hinge on what is done in the literature of your field. The more mathematical any field is, or its practitioners are, the more they tend to use natural logarithms. That's my impression. Other way round, if exp() is familiar notation in what you read, then ln() is the way to go.

      Just be consistent and make explicit what you do.

      I echo @Carlo Lazzaro's comments otherwise.


      • #4
        In computer science and genomics log base 2 is common.


        • #5
          I will just note that in Stata, -log(x)- and -ln(x)- are the same thing. If a base 10 logarithm is wanted, you would code that as -log10(x)-. There is no Stata function for base 2 logarithms, though it can be calculated as:
          Code:
          gen log2_x = log(x)/log(2)
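
          As a quick check of these points, here is a minimal sketch using -display- and -assert-. The values 10, 1000 and 8 are arbitrary illustrations, not taken from anyone's data:

          ```stata
          * log() and ln() are synonyms for the natural logarithm in Stata
          assert log(10) == ln(10)
          * base 10 logarithm
          display log10(1000)
          * base 2 logarithm via the change-of-base formula
          display log(8)/log(2)
          ```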


          • #6
            On pp.265-266 of https://journals.sagepub.com/doi/pdf...867X1801800116 I quote Paul R. Halmos being snarky about the notation ln on the grounds that for any mathematician (so, not me) there could be only one serious definition of logarithm. More historical trivia can be found at the same place.

            Here is one very practical use for log10. You want a histogram of a logged variable but to show axis labels on the original scale. If you use log10 then it may be easy to work out in your head an option call such as

            Code:
            xla(-1 "0.1" 0 "1" 1 "10" 2 "100" 3 "1000")
            or whatever. Much of the point of the example is that this is usually most needed when a variable spans several orders of magnitude.
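
            To make that concrete, here is a minimal sketch with simulated data. The variable names x and log10_x, the seed and the lognormal parameters are all made up for illustration:

            ```stata
            clear
            set seed 2023
            set obs 1000
            * simulated positive variable spanning several orders of magnitude
            gen x = exp(rnormal(2, 2))
            gen log10_x = log10(x)
            * bins are computed on the log10 scale; labels show the original scale
            histogram log10_x, xla(-1 "0.1" 0 "1" 1 "10" 2 "100" 3 "1000") xtitle("x (log scale)")
            ```

            Because the tick positions are whole powers of 10, the labels can be written down without a calculator, which is the convenience that natural logs do not offer here.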


            • #7
              How can I generate the natural log of a variable such as FDI inflow?


              • #8
                #7

                Code:
                gen ln_inflow = ln(inflow)

                tab inflow if missing(ln_inflow)
                The second command is to check for stray zeros or negative values in the original.
