Checking normality for Panel Data

Nicolaas Bos

Join Date: Mar 2017

Posts: 2
#1

20 Mar 2017, 11:55

Hi Guys,

Probably an elementary question, but the guide on Laerd statistics provides no explanation, and neither does Google. Hope you can help!

I'm testing 138 mergers for pre/post merger effects using 5 different variables, with 3 different benchmarks (unadjusted, industry adjusted and peer adjusted).

Even though graph box shows a decent distribution and Levene's test says my variances are equal, I have problems with normality.

Using the Shapiro Wilcoxon test, my P values are <.05, and so I conclude there is no normal distribution. However, when I leave out "noties", most of my P values rise to the point that there is a normal distribution.

3 Q's:
Should I only look at the P values?
When I look at W values (which should be high for normality), what should be considered the benchmark for normality?
My Laerd guide tells me to use the "noties" command when using swilk. How should I interpret this command, and what is the reason for using it?

Thanks!

Nicolaas
MSc student from Rotterdam
Nicolaas Bos

Join Date: Mar 2017

Posts: 2
#2

20 Mar 2017, 11:57

Using "noties" this is my table
Attached Files
Nick Cox

Join Date: Mar 2014

Posts: 35432
#3

20 Mar 2017, 12:31

This is very difficult to follow. We do advise that screenshots are not a good idea. Did you read the FAQ Advice before posting as requested?

http://www.statalist.org/forums/help#stata asks

In particular, please do not post screenshots. Many members will not be able to read them at all; they usually can't be read easily; and they do not allow copy and paste of data or code, which is highly desirable to allow experienced members to make precise suggestions for your questions.

I can just make out that you are applying Shapiro-Wilk tests for normality (no Wilcoxon here).

Laerd statistics: I guess that this is an allusion to https://statistics.laerd.com/ which seems to require sign-up and/or payment, so I didn't look further.

Google provides no explanation: this is a strong assertion. You don't believe that anything on the internet is helpful?

To the point: What worries you here is that Shapiro-Wilk often rejects the hypothesis of normality. You don't explain why this surprises you.

What I can suggest is this exercise for the auto data. This dataset has 74 observations, and so seems pertinent to you, as your sample sizes appear similar. You can run this code for yourself.

Code:

sysuse auto

foreach v of var price-foreign {
    quietly swilk `v'
    di "`v'{col 16}" %5.4f r(p)
}

price          0.0000
mpg            0.0043
rep78          0.4176
headroom       0.3314
trunk          0.2621
weight         0.0226
length         0.0946
turn           0.0880
displacement   0.0003
gear_ratio     0.0153
foreign        0.0684

I just fired a shotgun. I pushed all the numeric variables through swilk.

What can we infer from the P-values alone? Not much, reliably. It's best to look at normal quantile plots too. Here is how to get a slide show:

Code:

foreach v of var price-foreign {
    qnorm `v'
    more
}

Among other things we note:

price gives the strongest rejection of normality. That makes sense: it's a skewed distribution.

rep78 gives the P-value that is highest. But it's a 5-level ordered variable. It's an accident that it appears normal at all.

For those who are slaves to P = 0.05 and let it make their decisions for them, note that foreign gives a P-value that doesn't look too bad. But it's an indicator variable! It can't possibly be normal.
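The points above can be checked with a variant of the same loop. This is only a sketch: it prints W next to the P-value and adds the number of distinct values of each variable, which makes the discreteness of rep78 and foreign easy to spot. The column layout is an arbitrary choice; the loop relies on r(W) and r(p) left behind by swilk and on r(r) left behind by tabulate.

```stata
sysuse auto, clear

* columns: variable name, W, P-value, number of distinct values
foreach v of var price-foreign {
    quietly swilk `v'
    * copy the swilk results before tabulate overwrites r()
    local W = r(W)
    local p = r(p)
    * r(r) is the number of distinct nonmissing values
    quietly tabulate `v'
    di "`v'{col 16}" %6.4f `W' "  " %6.4f `p' "  " %3.0f r(r)
}
```

A variable with only a handful of distinct values cannot be normal, whatever W or P says, so the last column is worth reading before the other two.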

So, what's the moral?

Shotguns go off in all directions.

What can we infer from the P-values alone? Not much, reliably.

In any mix of variables, it should be no surprise that some are non-normal.

swilk isn't a reliable guide. You have to look at graphs too, and think about each variable.
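On the noties question in #1, a minimal sketch, assuming I read the swilk help correctly that noties stops swilk from using averaged ranks for tied values. Running the test both ways on a heavily tied variable such as rep78 shows how much the option alone moves W and P; the choice of rep78 is just for illustration.

```stata
sysuse auto, clear

* default: averaged ranks are used for tied values
swilk rep78

* noties: averaged ranks are not used for tied values
swilk rep78, noties
```

If the verdict flips with the option, that probably says more about how discrete the variable is than about whether it is normal, so a qnorm plot remains the better guide.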