Normality test

Belinda Foster

Join Date: Jul 2016

Posts: 132
#1

Normality test

08 Feb 2017, 11:05

Hello.

I have a continuous variable (say var1) and three treatments. What is the most appropriate way to test this variable for normality? I am asking because I want to decide whether to use a parametric or a nonparametric test for differences in var1 across the three treatments.

After consulting the manual, I think that for what I want to do, the command mvtest normality is the way forward. However, is it okay to use a generic syntax such as:

Code:

mvtest normality var1

or is it necessary to test by treatment level:

Code:

by treatment: mvtest normality var1

Last edited by Belinda Foster; 08 Feb 2017, 11:08.
Tags: None
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17660
#2

08 Feb 2017, 11:56

Belinda:
you can sidestep all of this and go -regress- instead (that requires normality of the residuals only):

Code:

regress var1 i.treatment

See -fvvarlist- about creating categorical variables and interactions (but I'd guess you already know it).

Kind regards,
Carlo
(StataNow 18.5)
Comment
Paul T Seed

Join Date: Apr 2014

Posts: 66
#3

08 Feb 2017, 12:23

In general, it is better not to use significance tests to decide which method of analysis to use. This is because you are not interested in whether your assumptions can be demonstrated to be true, but in whether the approximations are so badly out as to make the analysis invalid. For small samples, the tests often lack power and a non-significant result cannot be relied on. For large samples, they are significant for trivial differences. Better to look at the actual distributions, and plan accordingly.

When checking for Normality, I generally use the -qnorm- plot to look for serious violations of Normality, backed up by -ladder- or -gladder- to check for an appropriate transformation. In biochemistry (and many other situations), the usual answer is to take logs, carry out the tests and modelling needed, and then back-transform the differences on the log scale to give ratios of the geometric means. Ratios can be converted to fold changes if wanted.

If (as often happens with biochemical analysis) there are a large number of "zero" values, representing concentrations too small to detect, I generally use interval regression (command -intreg-), and represent these values as lying between minus infinity (shown as missing: .) and the lowest detected non-zero value. Throwing away all the low values is an obvious way to introduce bias.

A typical analysis might go:

Code:

qnorm var1
* NOTE: close enough. No transformation needed
regress var1 i.group

qnorm var2
ladder var2
* NOTE: logs needed
gen l_var2 = ln(var2)
* model the logged variable; eform() reports ratios of geometric means
regress l_var2 i.group, eform(Ratio_GM)

qnorm var3
count if var3 == 0
* NOTE: problem with zeros
ladder var3 if var3 > 0
* NOTE: logs needed
gen l_var3_l = ln(var3)
* zeros become missing (.), meaning - infinity
gen l_var3_h = ln(var3)
su l_var3_l, meanonly
replace l_var3_h = r(min) if var3 == 0
* zeros are replaced by the smallest non-zero value
intreg l_var3_l l_var3_h i.group
* NOTE: intreg does not accept the eform() option, so a calculator is needed.
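As an alternative to the calculator mentioned in the last note, -lincom- with its eform option can exponentiate a coefficient after -intreg-. The following is only a sketch on simulated data: the variable names, the group effect and the detection limit of 0.5 are all assumptions made up for illustration, not part of the analysis above.

```stata
* Simulated data: three groups, log-normal outcome (all values hypothetical)
clear
set seed 12345
set obs 300
gen byte group = 1 + mod(_n, 3)
gen double var3 = exp(rnormal(0.3*group, 1))

* Pretend values below a detection limit of 0.5 were recorded as zero
replace var3 = 0 if var3 < 0.5

* Lower bound: ln(0) is missing, which intreg reads as minus infinity
gen double l_lo = ln(var3)
* Upper bound: the smallest detected value on the log scale for the zeros
gen double l_hi = ln(var3)
summarize l_lo, meanonly
replace l_hi = r(min) if var3 == 0

intreg l_lo l_hi i.group

* Exponentiated differences on the log scale, read as ratios of geometric means
lincom 2.group, eform
lincom 3.group, eform
```

Each -lincom- line reports the exponentiated coefficient with its confidence interval, so no hand calculation is needed.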
1 like
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1115
#4

08 Feb 2017, 12:27

Belinda, as Carlo has noted in #2, it is the errors that are assumed to be normal.

I would go further and add that normality of the errors is far less important than independence and homoscedasticity of the errors. And as n increases, normality of the errors becomes less and less important. Therefore, I would not use a statistical test of normality: It will be under-powered when n is small (and normality of the errors is more important), and over-powered when n is large (and normality of the errors is not terribly important). It is generally better, IMO, to use graphical methods to assess the normality of the errors. You'll find relevant examples on this UCLA web-page.
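A minimal sketch of such a graphical check, using the auto data shipped with Stata in place of Belinda's data (mpg and rep78 stand in for var1 and treatment; that substitution is purely an assumption for illustration):

```stata
sysuse auto, clear

* Fit the model and save the residuals
regress mpg i.rep78
predict double res, residuals

* Normal quantile plot of the residuals
qnorm res

* Residuals versus fitted values, to look at spread across groups
rvfplot, yline(0)
```

The quantile plot addresses normality of the errors; the residual-versus-fitted plot gives a rough look at homoscedasticity.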

HTH.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)
1 like
Comment
Paul T Seed

Join Date: Apr 2014

Posts: 66
#5

08 Feb 2017, 12:36

To pick up Carlo Lazzaro's point:
It is certainly true that for tests and models based on the Normal distribution, it is strictly the distribution of the residuals that matters, not the distribution of the outcome variable. But in most situations, the difference between the distributions is too small for this to matter. This applies equally to linear regression, analysis of variance and t-tests; so using -regress- instead of -ttest- or -anova- is not a solution.

Using regress with the cluster() option does reduce the problem by adjusting the size of the standard errors; but that is a different matter.
Comment
Belinda Foster

Join Date: Jul 2016

Posts: 132
#6

08 Feb 2017, 12:51

Thank you all for your replies. However, I am not talking about normality of residuals. I am referring to tests that help get a sense of the data at hand before any analysis takes place. Now, I am aware that normality tests are far from an ideal method, but when I have a large number of continuous variables it is simply impractical to examine them all graphically. I need to narrow down the number of variables. So unless I am missing something, a normality test is another way to do this. It would be great if someone could answer my original questions.
Comment

Nick Cox

Join Date: Mar 2014

Posts: 35377
#7

08 Feb 2017, 13:05

I don't understand why graphical analysis is impractical while normality tests supposedly are practical. It's a loop over variables in each case. And the graphical analysis can show outliers, skewness, kurtosis, granularity etc. which may well be hidden or implicit in the tests. As others have emphasised, the test results are often artefacts of sample size in any case.

In the case of the auto data, for example, I can see all 11 graphs at once with

Code:

* do this once
* ssc inst combineplot

combineplot price-foreign: qnorm @y
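For anyone without combineplot installed, the loop over variables can be sketched with official commands only. This is an illustration on the auto data, not a replacement for the command above; the graph names q_... are made up here.

```stata
sysuse auto, clear

* Build one quantile-normal plot per variable without drawing it
local graphs
foreach v of varlist price-gear_ratio {
    qnorm `v', name(q_`v', replace) nodraw title("`v'")
    local graphs `graphs' q_`v'
}

* Show all the plots together
graph combine `graphs'
```

Replacing the qnorm line with a test command such as swilk would give the corresponding loop over tests, which is why neither approach is more practical than the other.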

Also, there is scope for measuring non-normality, as witness

Code:

. lmoments, short allobs

-------------------------------------------------------------------------------
              Variable |          n        l_1        l_2        t_3        t_4
-----------------------+-------------------------------------------------------
                 Price |         74   6165.257   1451.338      0.426      0.212
         Mileage (mpg) |         74     21.297      3.177      0.175      0.149
    Repair Record 1978 |         69      3.406      0.538      0.030      0.127
        Headroom (in.) |         74      2.993      0.482      0.031      0.048
 Trunk space (cu. ft.) |         74     13.757      2.453     -0.005      0.060
         Weight (lbs.) |         74   3019.459    447.023      0.025      0.031
          Length (in.) |         74    187.932     12.837     -0.014      0.038
     Turn Circle (ft.) |         74     39.649      2.512      0.011      0.029
Displacement (cu. in.) |         74    197.297     51.723      0.158      0.030
            Gear Ratio |         74      3.015      0.261      0.059      0.065
              Car type |         74      0.297      0.212      0.417     -0.048
-------------------------------------------------------------------------------

. moments, allobs

-------------------------------------------------------------------------------
              Variable |          n       mean         SD   skewness   kurtosis
-----------------------+-------------------------------------------------------
                 Price |         74   6165.257   2949.496      1.653      4.819
         Mileage (mpg) |         74     21.297      5.786      0.949      3.975
    Repair Record 1978 |         69      3.406      0.990     -0.057      2.678
        Headroom (in.) |         74      2.993      0.846      0.141      2.208
 Trunk space (cu. ft.) |         74     13.757      4.277      0.029      2.192
         Weight (lbs.) |         74   3019.459    777.194      0.148      2.118
          Length (in.) |         74    187.932     22.266     -0.041      2.042
     Turn Circle (ft.) |         74     39.649      4.399      0.124      2.229
Displacement (cu. in.) |         74    197.297     91.837      0.592      2.376
            Gear Ratio |         74      3.015      0.456      0.219      2.102
              Car type |         74      0.297      0.460      0.887      1.787
-------------------------------------------------------------------------------

moments and lmoments are from SSC.

Somewhat hilariously, foreign has high skewness as a side-effect of its low mean, but no useful transformation is possible for an indicator variable and no transformation is needed in any case.

Last edited by Nick Cox; 08 Feb 2017, 13:08.

Comment

Belinda Foster

Join Date: Jul 2016

Posts: 132
#8

08 Feb 2017, 13:33

Nick, I agree that graphical analysis is always better than a normality test, but that's not the point here. It would be useful to know the answer to my original question even if it is purely for encyclopaedic reasons.

By the way, do you suggest that instead of relying on a normality test, it is best to check normality using moments? I am aware that one can look at the skewness of a variable and decide whether it is normal or not, but what is an appropriate cut-off point? I think the consensus is that above an absolute value of 0.5 the distribution is not normal. In the example I gave above, do I have to check skewness by treatment group, or is an overall measure of skewness enough?

Last edited by Belinda Foster; 08 Feb 2017, 13:40.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35377
#9

08 Feb 2017, 14:06

I have no easy news for you, and indeed no news for you. Nothing is sufficient - test, graph, measure - but being careful to learn about the data is necessary for a defensible analysis. Arbitrary cut-offs for e.g. skewness can be no better advised than adhering to P < 0.05.

Your original question, I think, was whether to test groupwise or overall. Why shouldn't both be useful for their own questions?
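A short sketch of looking at both, using official -tabstat- on the auto data (mpg and foreign are stand-ins for var1 and treatment, chosen only for illustration):

```stata
sysuse auto, clear

* Overall moment-based summary
tabstat mpg, statistics(n mean sd skewness kurtosis) columns(statistics)

* The same summary within each group
tabstat mpg, by(foreign) statistics(n mean sd skewness kurtosis) nototal
```

The overall figures describe the pooled distribution, which mixes any group differences with the within-group shape; the groupwise figures describe the shape within each group.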
Comment
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#10

08 Feb 2017, 14:20

Originally posted by Belinda Foster

Thank you all for your replies. However, I am not talking about normality of residuals. I am referring to tests that help get a sense of the data at hand before any analysis takes place. Now, I am aware that normality tests are far from an ideal method, but when I have a large number of continuous variables it is simply impractical to examine them all graphically. I need to narrow down the number of variables. So unless I am missing something, a normality test is another way to do this. It would be great if someone could answer my original questions.

I realize this isn't what you asked, but normality of residuals is indirectly related to normality of the outcome variable.

When we regress, we are assuming that:

Y = XB + error

Here you can think of XB, in the simplest case, as the intercept plus the beta times the value of one independent variable. More formally, XB stands for multiple betas and multiple independent variables.

XB is fixed for each person in the data. When you subtract each person's value of XB from their Y, you're left with just the random error term. So yes, all the responses about normality of the error distribution are actually related to your question (albeit not directly).

But, because you asked, mvtest normality looks reasonable. Your first syntax is fine, although you can add the

Code:

, univariate

option, which may speed it up if you're running just one variable. If you want to test by treatment group, the by prefix should work. However, do you need to? Since the treatment is categorical, if the mean of Y is normal overall, then you should be OK.
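The two forms discussed here can be sketched on the auto data (mpg and foreign stand in for var1 and treatment, which is an assumption for illustration). Note that -bysort- sorts the data first, whereas a plain by prefix requires the data to be sorted on the grouping variable already.

```stata
sysuse auto, clear

* Overall test, with the univariate tests displayed
mvtest normality mpg, univariate

* The same test within each group
bysort foreign: mvtest normality mpg, univariate
```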

It sounds like you have a bunch of other continuous variables in there (post #6). If those are independent variables, then I don't think you need to test those for normality as well (it sounded like you might be planning to), because again, it's the normality of the error distribution that matters the most, which is sort of related to the normality of your outcome variable.

Also, note the available test options. Not sure how useful they are, but if you want to test for (excess??) skewness, you can do so. If you're going to use one of these, it's probably best if you can explain roughly what they're doing in plain English; looking at the math there, I certainly can't.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Comment
Belinda Foster

Join Date: Jul 2016

Posts: 132
#11

08 Feb 2017, 14:23

Nick, I have already provided the reason for asking:

I am asking because I want to decide whether to use a parametric or non parametric test that tests for differences in var1 across the three treatments.

What do you think?

Thanks, this has been a useful discussion.
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1115
#12

08 Feb 2017, 15:12

Belinda, re #1 and #11, it sounds like you are trying to decide between one-way ANOVA and the Kruskal-Wallis test. Is that right?

Most respondents so far (including me) are saying that using the result from a statistical test of normality is not a good way to make that choice. However, if you are going to go ahead, I think you need to look at normality by treatment levels. Why? Because as stated earlier, the normality assumption for OLS models applies to the errors. And for one-way ANOVA, residual = raw score - group mean. I.e., normality within groups is what is assumed. (Normality of the DV overall would only be assumed if there is absolutely no treatment effect--i.e., if all population means were equal.)

Alternatively, following Carlo's lead, fit the model, save the residuals, and test the normality of the residuals. E.g.,

Code:

quietly anova var1 treatment
predict double resid, residuals
mvtest normality resid
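The statement that, for one-way ANOVA, residual = raw score - group mean can be checked directly. A minimal sketch on the auto data, with mpg and rep78 standing in for var1 and treatment (an assumption for illustration only):

```stata
sysuse auto, clear

* Residuals from the one-way ANOVA
quietly anova mpg rep78
predict double resid, residuals

* Deviations from the group means, computed by hand
bysort rep78: egen double grpmean = mean(mpg)
gen double dev = mpg - grpmean

* The two agree (up to rounding) in the estimation sample
assert reldif(resid, dev) < 1e-8 if e(sample)
```

So testing or plotting the residuals is the same as pooling the within-group deviations, which is why the overall distribution of the outcome is not the relevant one when the group means differ.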

Please don't take this as an endorsement of testing for normality as a precursor to ANOVA. I am on record as saying that is generally a bad idea. ;-)

Finally, regarding the use of rank-based tests as tests of location, bear in mind that they are only tests of location under what some authors call the "pure shift" model--i.e., when the population shapes are identical, and differ (if at all) only in location. That situation is arguably rare in practice. See the nice simulation study by Fagerland & Sandvik for more information.

HTH.

Cheers,
Bruce

Last edited by Bruce Weaver; 08 Feb 2017, 15:26.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)
Comment
