Can I transform a dependent variable?

Iqbal Chowdhury

Join Date: May 2023

Posts: 33
#1

04 Apr 2024, 12:10

Hello Clyde Schechter, Bruce Weaver, and Leonardo Guizzetti,

Can you please help me with the following query?

I am running an OLS model with a scale dependent variable: a depression scale constructed from 20 items. It looks like the scale is not normally distributed, so I am wondering whether I can transform it to bring it closer to a normal distribution. Can I log-transform it?

Your suggestions will help me a lot to deal with the issue.

Thank you in advance for your suggestions.

Iqbal
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

04 Apr 2024, 12:51

Well, as long as the scale has no zero or negative values you can. But why do that? It is a widespread myth that the dependent variable in an OLS model needs to be normally distributed. This myth arises out of confusion with a theorem that says that if the residuals are normally distributed, then the t-statistics one calculates after OLS regression actually have t-distributions and yield correct inference. But even this is generally not a reason to try to normalize anything (even the residuals): due to the central limit theorem, if the sample is large enough, the usual calculations will yield correct inferences regardless of the residual distribution. In short, normality is of no practical importance in OLS regression unless you are working with a small data sample.

So normality of the residuals will be relevant, but only if your sample is small. How small is small? As a practical matter, if you have a sample size of 30 or more you are usually quite safe ignoring normality altogether. Consider, too, that if your sample size is less than that, your analysis is probably going to be underpowered for detecting anything other than effect sizes so large that they are already widely known relationships, and you will also have very little power for testing normality itself!
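
To make the central limit theorem argument concrete, here is a minimal Monte Carlo sketch, not a definitive demonstration. It generates strongly skewed (centered chi-squared) errors and records how often the usual 95% confidence interval for the slope covers the true value. The program name simskew, the sample size of 100, the true slope of 0.5, and the 2000 replications are all illustrative assumptions, not anything from the original question.

```stata
* Monte Carlo sketch: coverage of the nominal 95% CI for an OLS slope
* when the errors are strongly skewed. All names and settings are illustrative.
clear all
set seed 12345

program define simskew, rclass
    args n
    drop _all
    set obs `n'
    generate x = rnormal()
    generate e = rchi2(1) - 1          // skewed errors with mean zero
    generate y = 1 + 0.5*x + e         // true slope is 0.5
    regress y x
    * Usual t-based 95% confidence limits for the slope
    local lo = _b[x] - invttail(e(df_r), 0.025)*_se[x]
    local hi = _b[x] + invttail(e(df_r), 0.025)*_se[x]
    * 1 if the interval covers the true slope, 0 otherwise
    return scalar covered = (`lo' <= 0.5) & (0.5 <= `hi')
end

simulate covered = r(covered), reps(2000) nodots: simskew 100
summarize covered
```

The mean of covered estimates the actual coverage, which can be compared with the nominal 0.95. Rerunning with a smaller argument to simskew (say 10) shows how the approximation behaves in small samples.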
Iqbal Chowdhury

Join Date: May 2023

Posts: 33
#3

04 Apr 2024, 14:24

Thank you so much, Clyde Schechter, for your valuable comment.
The variable is negatively skewed (skewness = -1.09) and has high kurtosis (4.00). That is why I was thinking of transforming it. What I did, which might not be correct, was to first reflect the variable and then log-transform it. It looks like it works. Do you think it is acceptable to do that? You can see the document suggesting this in the attached file.
Thank you once again.

Iqbal

NegSkew.pdf

https://core.ecu.edu
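
For readers unfamiliar with the reflect-then-log transformation described above, it can be sketched in Stata roughly as follows. The variable name depress and the simulated data are hypothetical stand-ins for the real 20-item scale; the sketch only shows the mechanics and does not imply that the transformation is needed.

```stata
* Reflect-then-log sketch on made-up data; "depress" is a hypothetical name.
clear
set seed 2024
set obs 500
generate depress = 60 - rchi2(4)                 // simulated left-skewed score

summarize depress, detail                        // skewness and kurtosis before

* Reflect: subtract each score from (maximum + 1) so the smallest reflected
* value is 1, which keeps the logarithm defined. Then take the natural log.
summarize depress, meanonly
generate depress_rl = ln(r(max) + 1 - depress)

summarize depress_rl, detail                     // skewness and kurtosis after
```

Note that reflection reverses the direction of the scale: higher values of depress_rl correspond to lower original scores, so regression coefficients change sign and are on the log scale of the reflected score.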
Bruce Weaver

Join Date: May 2014

Posts: 1109
#4

04 Apr 2024, 14:52

Adding to Clyde Schechter's excellent answer, the key assumptions are about the errors, not the dependent variable. And in order to assess how well those assumptions about the errors are met, you need to estimate the model you want to estimate and save the residuals! The following example from the good folks at UCLA will give you some ideas about how to do this.

Code:

* Source: https://stats.oarc.ucla.edu/stata/webbooks/reg/chapter2/stata-webbooksregressionwith-statachapter-2-regression-diagnostics/

* 2.2 Checking Normality of Residuals
clear
use https://stats.idre.ucla.edu/stat/stata/webbooks/reg/elemapi2
regress api00 meals ell emer
predict r, resid
kdensity r, normal
graph rename kdplot, replace
pnorm r
graph rename pnormplot, replace
qnorm r
graph rename qnormplot, replace

* 2.3 Checking Homoscedasticity of Residuals
rvfplot, yline(0)

PS- The distinction I made between errors and residuals was deliberate. This Wikipedia page explains the distinction quite nicely, I think.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)