  • Using Logarithmic form of dependent variable in Panel Data

    Any help would be much appreciated.
    I am currently investigating panel data where the dependent variable is the percentage share of the market. Values range from roughly 0.04 to 40 across all panels. Because of outliers and a right-skewed distribution, I am attempting to use the log of market share, which is consistent with the literature.
    However, many values are below 1, which produces a large number of negative log values. Is there a way around this, and how would it be implemented? I have seen a constant added, as in ln(Y+1), but with that transformation none of my variables are significant.
    I plan to use a fixed effects model with time fixed effects and clustered standard errors.
    Would my results still be valid and appropriate without using the logarithmic form of the dependent variable? Is there a way around this issue?
    Thanks
    Last edited by Ollie Beedle; 28 Apr 2022, 08:49.

  • #2
    Ollie:
    A log transformation would worsen the negative skewness of your regressand.
    I would stick with the original metric and double-check whether the outliers are legitimate observations produced by the data-generating process or the result of data-entry mistakes.
    Kind regards,
    Carlo
    (StataNow 18.5)


    • #3
      Thank you Carlo,
      My apologies, but I realised I had meant to say right-skewed; I'll correct the original post. The outliers are correct data points, mainly due to a large increase in market share in some countries (28 countries in total) during the last 2 years of the data set (6 years in total).


      • #4
        Ollie:
        -xtreg,fe vce(cluster panelid)- with a logged regressand may be the way to go, then.
        If you go log-linear, the variation in the regressand is interpreted in % terms.
        I would stay away from ( ln(Y+1) ) corrections.
        Kind regards,
        Carlo
        (StataNow 18.5)
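
        A minimal sketch of the specification discussed above, assuming hypothetical variable names that do not come from the thread: share for the market share in %, x1 and x2 for the regressors, country for a numeric panel identifier, and year for the time variable. Time fixed effects are added as year dummies, as planned in the original post.

        ```stata
        * Hypothetical variable names; replace them with your own.
        * country must be numeric (use -encode- first if it is a string).
        generate ln_share = ln(share)      // negative whenever share < 1
        xtset country year
        * Country fixed effects, year dummies, SEs clustered on country
        xtreg ln_share x1 x2 i.year, fe vce(cluster country)
        ```

        With 28 countries the number of clusters is fairly small, so the clustered standard errors are worth reading with some caution.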


        • #5
          Okay thank you Carlo,
          Lastly, just to clarify, is it an issue if some of the logged regressand values are negative? Would this invalidate the coefficients/p-values given for the regressors?


          • #6
            Ollie:
            No, because you have to exponentiate the coefficients to obtain their contributions (when adjusted for the other predictors) to variations in the regressand:
            Code:
            [exp(<coefficient>)-1]
            as in the following toy-example:
            Code:
            . di exp(-0.24)-1
            -.2133721 // i.e., a 21.34% decrease
            Kind regards,
            Carlo
            (StataNow 18.5)
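
            The same calculation can be run directly on a stored coefficient after estimation. The sketch below is only an illustration: it uses the Grunfeld example panel that ships with Stata (loaded with -webuse-, which needs internet access) rather than the market-share data from this thread, and the choice of regressors is arbitrary.

            ```stata
            webuse grunfeld, clear
            xtset company year
            generate ln_invest = ln(invest)
            xtreg ln_invest mvalue kstock, fe vce(cluster company)

            * Proportional change in invest for a one-unit increase in mvalue
            display exp(_b[mvalue]) - 1

            * Same quantity with a delta-method standard error and CI
            nlcom exp(_b[mvalue]) - 1
            ```

            Multiply the result by 100 to read it as a percentage change.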


            • #7
              Negative logarithms — indicating values below 1 — are not a problem any more than percents below 1 are a problem.
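
              One way to check this is to rescale the regressand so that every log changes sign. Since ln(Y/c) = ln(Y) - ln(c), the rescaling should shift only the constant and leave the slopes and their standard errors unchanged. The sketch below assumes the Grunfeld example panel from -webuse- (not the poster's data) and an arbitrary scaling factor of 10000.

              ```stata
              webuse grunfeld, clear
              xtset company year

              generate ln_inv     = ln(invest)         // all positive except where invest < 1
              generate ln_inv_neg = ln(invest/10000)   // all negative in this data set

              xtreg ln_inv mvalue kstock, fe vce(cluster company)
              estimates store original

              xtreg ln_inv_neg mvalue kstock, fe vce(cluster company)
              estimates store rescaled

              * Slopes and SEs should match; _cons differs by ln(10000)
              estimates table original rescaled, b se
              ```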
