I have lots of zeros in both my dependent and independent variables.
One way I tried to deal with this was by adding 1 to all of the values. However, each of the variables is then right-skewed, so I took the natural log to try to get a normal distribution. When I do this, I get a spike on the left followed by a roughly normal distribution (see an example below). Since I believe this violates the assumption of normality, I tried dropping the zeros, but that reduces the sample size too much and I no longer get significance in my models. I read that I could impute the zero values with the mean, but I know that would misrepresent my data. I also read that I could take the square root instead of the log as a transformation, but the data are still right-skewed rather than normally distributed. Any other thoughts on how I might deal with this issue would be much appreciated!
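To make the spike concrete, here is a minimal sketch using simulated data, not my real variables. The 30% share of zeros, the lognormal shape of the positive values, and the helper name `log1p_summary` are all assumptions chosen only for illustration. It shows that log(1 + x) maps every zero to exactly zero, so the zeros stay piled up in one place while the positive values spread out.

```python
import numpy as np


def log1p_summary(values):
    """Return the share of exact zeros and the log(1 + x) transform."""
    values = np.asarray(values, dtype=float)
    transformed = np.log1p(values)  # same as np.log(values + 1)
    zero_share = float(np.mean(values == 0))
    return zero_share, transformed


# Simulated stand-in for one variable (assumed shape, not real data):
# about 30% exact zeros, the rest right-skewed positive values.
rng = np.random.default_rng(0)
n = 1000
is_zero = rng.random(n) < 0.3
positive = rng.lognormal(mean=3.0, sigma=0.5, size=n)
x = np.where(is_zero, 0.0, positive)

zero_share, x_log = log1p_summary(x)

# Every zero is still exactly zero after the transform: this is the spike.
n_spike = int(np.sum(x_log == 0))
# The positive values end up well away from zero.
smallest_positive = float(x_log[x > 0].min())

print(f"share of zeros: {zero_share:.3f}")
print(f"values still at 0 after log(1 + x): {n_spike}")
print(f"smallest transformed positive value: {smallest_positive:.3f}")
```

Plotting a histogram of `x_log` should reproduce the shape I describe: one bar at zero, then a roughly bell-shaped cluster to the right.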