Dear Statalists!
I am a bit confused about the assumptions/requirements of the t-test, and I think there are some very good experts here.
As far as I know, the "unpaired" two-sample t-test requires approximately normally distributed data to test the difference in the sample means. In my situation this assumption does not hold: the data in both samples are more than a little skewed, even with a relatively large sample (n=100).
I have now read a lot about this assumption, and some authors state that it is not actually required for the "unpaired" t-test. They refer to various simulations showing no significant difference in power (compared to a non-parametric test such as the Wilcoxon rank-sum test) and argue that with a large sample the t-statistic becomes approximately normally distributed (central limit theorem?).
If I apply the non-parametric test, I get markedly different results that would change the whole story.
My questions: Is it statistically acceptable to apply the unpaired t-test to skewed data when the sample is relatively large?
- If yes, are there good papers that explain the rationale/arguments behind this?
- If not, is there a way to estimate (or test) the power of the t-test in my setting compared to a non-parametric (or any other) test?
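One way to approach the second point is a small Monte Carlo simulation: draw skewed samples of the same size many times, run both tests on each draw, and compare how often each rejects. The following is only a minimal sketch under stated assumptions, not a definitive procedure. The lognormal shape, the size of the shift, the number of replications, and the function names (`rejection_rates`, `welch_z_pvalue`, `ranksum_pvalue`) are all illustrative choices, not taken from the data described above. Both p-values use the large-sample normal approximation, which seems reasonable for n=100 but is itself an assumption.

```python
import math
import random


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def welch_z_pvalue(x, y):
    """Two-sided p-value for the unequal-variance (Welch) two-sample test.

    Assumption: the t reference distribution is replaced by the standard
    normal, which is close for samples of about 100 per group.
    """
    mean_x, var_x = mean_and_variance(x)
    mean_y, var_y = mean_and_variance(y)
    se = math.sqrt(var_x / len(x) + var_y / len(y))
    z = (mean_x - mean_y) / se
    return math.erfc(abs(z) / math.sqrt(2))


def ranksum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation.

    Assumption: continuous data, so no tie correction is applied.
    """
    nx, ny = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(r for r, (_, g) in enumerate(pooled, start=1) if g == 0)
    expected = nx * (nx + ny + 1) / 2
    sd = math.sqrt(nx * ny * (nx + ny + 1) / 12)
    z = (rank_sum_x - expected) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def rejection_rates(n=100, shift=0.0, sigma=1.0, reps=2000, alpha=0.05, seed=12345):
    """Share of simulated data sets in which each test rejects at level alpha.

    Group 1 is lognormal(0, sigma); group 2 is lognormal(shift, sigma).
    With shift=0 the rates estimate the type I error, otherwise the power.
    """
    rng = random.Random(seed)
    t_rejections = 0
    w_rejections = 0
    for _ in range(reps):
        x = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
        y = [rng.lognormvariate(shift, sigma) for _ in range(n)]
        if welch_z_pvalue(x, y) < alpha:
            t_rejections += 1
        if ranksum_pvalue(x, y) < alpha:
            w_rejections += 1
    return t_rejections / reps, w_rejections / reps


if __name__ == "__main__":
    for shift in (0.0, 0.5):
        t_rate, w_rate = rejection_rates(shift=shift)
        print(f"shift={shift}: t-test {t_rate:.3f}, rank-sum {w_rate:.3f}")
```

To mimic a real data set, the lognormal draws would have to be replaced by a distribution that resembles the observed one (for example, resampling from the data). Note also that the two tests do not address the same hypothesis: the t-test compares means, while the rank-sum test asks whether values in one group tend to be larger than in the other, so with skewed data they can legitimately disagree.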