  • Can I transform a dependent variable?

    Hello Clyde Schechter, Bruce Weaver, and Leonardo Guizzetti,

    Can you please help me with the following query?

    I am running an OLS model with a scale dependent variable, a depression scale constructed from 20 items. The scale does not appear to be normally distributed, so I am wondering whether I can transform it to bring it closer to a normal distribution. Can I log-transform it?

    Your suggestions will help me a lot to deal with the issue.

    Thank you in advance for your suggestions.

    Iqbal

  • #2
    Well, as long as the scale has no zero or negative values you can. But why do that? It is a widespread myth that the dependent variable in an OLS model needs to be normally distributed. This myth arises out of confusion with a theorem that says that if the residuals are normally distributed, then the t-statistics one calculates after OLS regression actually have t-distributions and yield correct inference. But even this is generally not a reason to try to normalize anything (even the residuals): due to the central limit theorem, if the sample is large enough, the usual calculations will yield correct inferences regardless of the residual distribution. In short, normality is of no practical importance in OLS regression unless you are working with a small data sample.

    So normality of the residuals will be relevant, but only if your sample is small. How small is small? As a practical matter, if you have a sample size of 30 or more you are usually quite safe ignoring normality altogether. Consider, too, that if your sample size is less than that, your analysis is probably going to be underpowered for detecting anything other than effect sizes so large that they are already widely known relationships, and you will also have very little power for testing normality itself!
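
    As an illustration of the large-sample point, here is a minimal simulation sketch. The program name skewsim, the sample size of 100, the centered chi-squared errors, and the true slope of 0.5 are all assumptions chosen for the example; none of them come from the data discussed in this thread.

    Code:
    * Sketch: coverage of the usual 95% CI for a slope when errors are skewed
    clear all
    set seed 12345

    program define skewsim, rclass
        drop _all
        set obs 100
        generate x = rnormal()
        * Centered chi-squared(1) errors: strongly right-skewed, mean zero
        generate e = rchi2(1) - 1
        generate y = 1 + 0.5*x + e
        regress y x
        * covered = 1 if the usual 95% CI contains the true slope of 0.5
        return scalar covered = abs(_b[x] - 0.5) <= invttail(e(df_r), 0.025)*_se[x]
    end

    * Repeat the experiment and look at the share of intervals that cover 0.5
    simulate covered = r(covered), reps(1000) nodots: skewsim
    summarize covered

    If the usual inference is holding up at this sample size, the mean of covered should sit close to 0.95. Changing the number in set obs shows how the picture shifts in smaller samples.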



    • #3
      Thank you so much, Clyde Schechter, for your valuable comment.
      The variable is negatively skewed (skewness = -1.09) and has high kurtosis (4.00). That is why I was thinking of transforming it. What I did, which may not be correct, was to first reflect the variable and then log-transform it. It seems to work. Do you think that is acceptable? The attached file is the document that suggests this approach.
      Thank you once again.

      Iqbal

      NegSkew.pdf
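
      For reference, the reflect-then-log step described above might look like the following sketch. The variable name depress and the two generated names are hypothetical. Adding 1 after reflecting is one common choice; it keeps every reflected value at 1 or above so that the log is defined.

      Code:
      * depress is a hypothetical name for the 20-item scale score
      summarize depress, meanonly
      * Reflect: the largest score becomes 1, so left skew becomes right skew
      generate depress_ref = r(max) + 1 - depress
      * Every non-missing reflected value must be at least 1 before taking logs
      assert depress_ref >= 1 if !missing(depress_ref)
      generate ln_depress_ref = ln(depress_ref)
      * Compare skewness and kurtosis before and after the transformation
      summarize depress ln_depress_ref, detail

      Note that reflection reverses the direction of the scale, so the signs of regression coefficients flip relative to a model fitted to the original score.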



      • #4
        Adding to Clyde Schechter's excellent answer, the key assumptions are about the errors, not the dependent variable. And in order to assess how well those assumptions about the errors are met, you need to estimate the model you want to estimate and save the residuals! The following example from the good folks at UCLA will give you some ideas about how to do this.

        Code:
        * Source:  https://stats.oarc.ucla.edu/stata/webbooks/reg/chapter2/stata-webbooksregressionwith-statachapter-2-regression-diagnostics/
        
        * 2.2 Checking Normality of Residuals
        clear
        use https://stats.idre.ucla.edu/stat/stata/webbooks/reg/elemapi2
        regress api00 meals ell emer
        
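        * Save the residuals from the model just estimated
        predict r, resid
        
        * Kernel density of the residuals with a normal density overlaid
        kdensity r, normal
        graph rename kdplot, replace
        
        * Standardized normal probability (P-P) plot
        pnorm r
        graph rename pnormplot, replace
        
        * Quantile-normal (Q-Q) plot
        qnorm r
        graph rename qnormplot, replace
        
        * 2.3 Checking Homoscedasticity of Residuals
        * Residual-versus-fitted plot with a reference line at zero
        rvfplot, yline(0)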

        PS: The distinction I made between errors and residuals was deliberate. This Wikipedia page explains the distinction quite nicely, I think.
        --
        Bruce Weaver
        Version: Stata/MP 18.5 (Windows)
