  • Normalizing variable which has negative values

    Hello

    I am trying to "normalize" a variable which has negative values, which entails obtaining values smaller than 3 for both the skewness and the kurtosis of the resulting distribution.

    If I simply use the log function I lose a lot of cases, which I would like to avoid. I have also tried many other options (the gladder command, among others), including the most "popular" one, which is adding a constant "a" (the absolute value of the most negative value of the variable) and then taking the log as follows: generate ln_variable = ln(variable+a). However, when I do this, the skewness of ln_variable is a large negative number and its kurtosis is a very large positive value. Does anybody know of a transformation I can use to normalise my variable (i.e. one which does not entail getting rid of negative values and which also keeps skewness and kurtosis smaller than 3)? Any help you can provide would be much appreciated.
    Kind regards,

    Joao
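
    One likely reason for the behaviour described above is that, with "a" set to the absolute value of the observed minimum, the minimum itself maps to ln(0), which Stata returns as missing, and values just above the minimum become large negative logs. A minimal sketch to check this, assuming the minimum is negative; "variable" is the name used in the post, while the scalar shift and the variable ln_shifted are hypothetical names:

    ```stata
    * Sketch only: shift and ln_shifted are hypothetical names.
    quietly summarize variable
    scalar shift = abs(r(min))
    generate double ln_shifted = ln(variable + scalar(shift))

    * Observations lost because variable + shift is zero.
    count if missing(ln_shifted) & !missing(variable)

    * Inspect skewness and kurtosis of the shifted log.
    summarize ln_shifted, detail
    ```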

  • #2
    Why do you want to do this? For example, not much in statistics assumes a marginal normal distribution. Even if it does, (moment-based) skewness ~ 3 is not a good approximation to normal, so lose-lose.

    Do zeros have intrinsic meaning? If so, then log(variable + constant) is a bad idea as not only is the choice of constant a dark art, if based on the observed minimum it is utterly arbitrary and destroys that meaning.

    Are you minimum and maximum attainable values?

    Cube roots and asinh() can help. But you have to ensure that the cube rooting is done correctly.
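
    As a minimal sketch of the two transformations just mentioned, assuming the variable is called x (a hypothetical name): in Stata, x^(1/3) evaluates to missing for negative x, so one common way to do the cube rooting is to take the root of the absolute value and restore the sign afterwards. The new variable names are also hypothetical.

    ```stata
    * Sketch only: x, curt_x and asinh_x are hypothetical names.
    * Sign-preserving cube root, defined for negative, zero and positive x.
    generate double curt_x = sign(x) * abs(x)^(1/3)

    * Inverse hyperbolic sine, also defined for all real x.
    generate double asinh_x = asinh(x)

    * Compare moment-based skewness and kurtosis before and after.
    summarize x curt_x asinh_x, detail
    ```

    Note that summarize with the detail option reports kurtosis on the scale where a normal distribution has kurtosis 3, not 0.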

    Perhaps most important: why not show us a graph, say using histogram or quantile?


    • #3
      One sentence needed to be: Are there minimum and maximum attainable values?


      • #4
        Dear Nick

        Thank you very much for your feedback. There are no minimum and maximum attainable values for "var". The zeros do not have any intrinsic meaning. I have tried asinh and the cube root, but neither brought skewness and kurtosis down to 3 or below (in absolute value).

        Please find attached the histogram for the variable of interest. Thank you very much in advance!

        Best,
        Joao

        Last edited by Joao Sa Oliveira; 13 Feb 2020, 12:38.


        • #5
          I can’t comment on the histogram. Please note the request to post graphs as .png.

          As said, skewness of 3 is a long way from normal, or any other kind of symmetry.


          • #6
            Dear Nick


            Please see below:

            [Attached image: Histogram.png]


            Best,
            J
            Last edited by Joao Sa Oliveira; 13 Feb 2020, 12:37.


            • #7
              You can use asinh(x/k) but k can at best be a fudge factor.
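
              Since k has no principled value, one way to proceed is simply to try a few candidates and look at the resulting moments. A minimal sketch, assuming the variable is called x; the name x and the candidate values of k are hypothetical and would need to be chosen to suit the scale of the data:

              ```stata
              * Sketch only: x and the list of k values are hypothetical.
              foreach k of numlist 1 10 100 1000 {
                  tempvar t
                  generate double `t' = asinh(x / `k')
                  quietly summarize `t', detail
                  display "k = `k'" _col(12) "skewness = " %6.3f r(skewness) _col(36) "kurtosis = " %6.3f r(kurtosis)
                  drop `t'
              }
              ```

              Small k makes the transformation behave more like a signed logarithm over most of the data, while large k leaves it close to linear, so k controls how strongly the tails are pulled in.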


              • #8
                Dear Nick,

                Thank you very much for your feedback. I used asinh(x/k) and it worked. Skewness was reduced to a value close to zero and kurtosis to a value smaller than 3.

                Kind regards,
                Joao
