  • log Transformations, How to Handle Negative Data Values?

    Hello there,

    I want to run a quantile regression to examine gender differences in earnings. My dependent variable is hourly wages, and I am using the log of this variable, but my data has negative values; that is, there are some observations where the number of hours worked per week is greater than weekly wages. I have read some blogs that suggest adding a constant, but others don't recommend it. Could you please share some ideas about this?

    Thanks,



  • #2
    1. If you're intending to perform quantile regression, why are you doing any transformation of the response variable?

    2. What does it mean to have negative values for hourly wages? The employee pays the employer to work?



    • #3
      Thus, I am using the log of this variable, but my data has negative values
      I don't think your data has negative values, unless someone is earning negative wages or working negative hours. The hourly wages will be positive - the ratio of two positive numbers - so you can take the log of hourly wages. The fact that the log of hourly wages has negative values is not important.

      We are warned about using logs when we would need to take the log of a negative number, because that is not possible. Since your hourly wages will not be negative, this is not a problem for you.



      • #4
        Hello,

        Thanks for replying. I am sorry, I explained my question badly. I have a dependent variable called "hourlywages", and this variable is positive. I need to log-transform it to obtain log_hourlywages, and it is this last variable that has negative values, so my question is how I can handle them if I need to run a regression. Replying to Joseph: I have read that when wages are the dependent variable in a regression, it is better to use the log transformation. Is that wrong?

        Below you can see the summary statistics of my variables.

        Variable          Obs   Mean      Std. dev.  Minimum    Maximum
        hourlywages       5681  13.06197  13.63631   .0083333   416.6667
        log_hourlywages   5681  2.283423  .7444923   -4.787492  6.032287

        Thanks again,



        • #5
          Sofia:
          usually the reason underlying log transformation of the regressand (while keeping the predictors in their non-logged metric) is to explain in percentage terms the contribution to variation of the regressand produced by each predictor (when adjusted for the other ones).
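
          As a rough sketch of the percentage interpretation mentioned above: when the regressand is logged, a coefficient b on a one-unit change in a predictor corresponds to a change of 100*(exp(b) - 1) percent in the unlogged regressand, which is close to 100*b when b is small. The coefficient value below is hypothetical and chosen only for illustration; it is not estimated from the data in this thread.

```python
import math


def percent_change(coef: float) -> float:
    """Percentage change in the unlogged outcome implied by a coefficient
    from a regression whose outcome is in natural logs."""
    return 100 * (math.exp(coef) - 1)


# Hypothetical coefficient on a gender indicator (illustrative only,
# not taken from any model fitted in this thread).
hypothetical_coef = -0.15

exact = percent_change(hypothetical_coef)
approx = 100 * hypothetical_coef  # common small-coefficient approximation

print(f"exact:  {exact:.1f}%")
print(f"approx: {approx:.1f}%")
```

          The gap between the exact and approximate figures grows with the size of the coefficient, so the exact formula is the safer one to report.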
          You cannot do anything to deal with negative logged values: they are simply a matter of fact.
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Negative values in the unlogged variable would be problematic (not to mention nonsensical; you have to pay somebody to work for them?)

            But, it is fine for the logged variable to be negative. Suppose someone makes $1 an hour. Then, ln(1) = 0.

            Now, suppose they make only 10 cents an hour. Then, ln(.10) is -2.3025851.

            Are you sure your data are clean? That poor person who makes .0083333 an hour has to work 120 hours to make $1! (Or 1 of whatever unit wages are being measured in.)
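
            The arithmetic above can be checked with a short sketch. It uses only the wage values quoted in this thread ($1 and $0.10 an hour, plus the minimum and maximum of hourlywages reported in #4); the helper name `log_wage` is made up for illustration.

```python
import math

# Values quoted in the thread: $1 and $0.10 an hour, plus the reported
# minimum and maximum of hourlywages from the summary table in #4.
hourly_wages = [1.0, 0.10, 0.0083333, 416.6667]


def log_wage(wage: float) -> float:
    """Natural log of a strictly positive wage."""
    if wage <= 0:
        raise ValueError("log is undefined for zero or negative wages")
    return math.log(wage)


for wage in hourly_wages:
    # Any wage below 1 gives a negative log; a wage above 1 gives a positive log.
    print(f"wage = {wage:>11.7f}   ln(wage) = {log_wage(wage):>10.6f}")

# Hours needed to earn one currency unit at the minimum reported wage.
hours_for_one_unit = 1 / 0.0083333
print(f"hours to earn 1 unit at the minimum wage: {hours_for_one_unit:.1f}")
```

            The logs of the reported minimum and maximum match the minimum and maximum of log_hourlywages in the summary table, so the negative values come only from wages below 1.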

            Still unclear is how and why you are using logged values with quantile regression.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            WWW: https://www3.nd.edu/~rwilliam



            • #7
              There is nothing wrong statistically with #4. Negative logarithms necessarily and unproblematically arise when the argument is less than 1. That's what a logarithm means....

              The problem lies upstream in that hourly wages of 0.008333 in whatever currency units look puzzling, although it's more than is paid to people who answer questions on Statalist.



              • #8
                Thanks for your comments. I was worried about these negative values.

                Well, the data come from the national survey of a developing country, which may be a reason why the values are so low, but I will review the data to see whether I can find a mistake. As for why I am using quantile regression with log wages, I have only reviewed some papers, and they use log wages in a quantile regression. I know this is not an excuse, but I do not have an answer to your comment, Richard, and I didn't know that it is not possible.
