  • Checking normality for Panel Data

    Hi Guys,

    Probably an elementary question, but the guide on Laerd statistics provides no explanation and neither does Google. Hope you can help!

    I'm testing 138 mergers for pre/post merger effects using 5 different variables, with 3 different benchmarks (unadjusted, industry adjusted and peer adjusted).

    Even though graph box shows a decent distribution and Levene's test says the variances are equal, I have problems with normality.

    Using the Shapiro Wilcoxon test, my P values are <.05, so I conclude that the data are not normally distributed. However, when I leave out "noties", most of my P values rise to the point where normality is no longer rejected.

    3 Q's:
    Should I only look at the P values?
    When I look at W values (which should be high for normality), what should be considered the benchmark for normality?
    My Laerd guide tells me to use the "noties" option when using swilk. How should I interpret this option, and what is the reason for using it?

    Thanks!

    Nicolaas
    MSc student from Rotterdam

  • #2
    Using "noties" this is my table
    Attached Files

    • #3
      This is very difficult to follow. We do advise that screenshots are not a good idea. Did you read the FAQ Advice before posting as requested?

      http://www.statalist.org/forums/help#stata asks

      In particular, please do not post screenshots. Many members will not be able to read them at all; they usually can't be read easily; and they do not allow copy and paste of data or code, which is highly desirable to allow experienced members to make precise suggestions for your questions.

      I can just make out that you are applying Shapiro-Wilk tests for normality (no Wilcoxon here).

      Laerd statistics: I guess that this is an allusion to https://statistics.laerd.com/ which seems to require sign-up and/or payment, so I didn't look further.

      Google provides no explanation: this is a strong assertion. You don't believe that anything on the internet is helpful?

      To the point: What worries you here is that Shapiro-Wilk often rejects the hypothesis of normality. You don't explain why this surprises you.

      What I can suggest is this exercise for the auto data. This dataset has 74 observations, and so seems pertinent to you, as your sample sizes appear similar. You can run this code for yourself.

      Code:
      sysuse auto, clear

      * run Shapiro-Wilk quietly on each variable and show only its P-value
      foreach v of var price-foreign {
          quietly swilk `v'
          di "`v'{col 16}" %5.4f r(p)
      }
      
      price          0.0000
      mpg            0.0043
      rep78          0.4176
      headroom       0.3314
      trunk          0.2621
      weight         0.0226
      length         0.0946
      turn           0.0880
      displacement   0.0003
      gear_ratio     0.0153
      foreign        0.0684

      I just fired a shotgun. I pushed all the numeric variables through swilk.

      What can we infer from the P-values alone? Not much, reliably. It's best to look at normal quantile plots too. Here is how to get a slide show:

      Code:
      foreach v of var price-foreign {
          qnorm `v'
          more 
      }
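
      If stepping through a slide show is inconvenient, the same plots can be collected into one figure. This is only a sketch, assuming the auto data again; the graph names q_price, q_mpg and so on are made up here for illustration.

      ```stata
      sysuse auto, clear

      * Build one normal quantile plot per variable without drawing it,
      * remembering each graph's name, then show them all in one figure.
      local graphs
      foreach v of var price-foreign {
          quietly qnorm `v', name(q_`v', replace) nodraw title("`v'")
          local graphs `graphs' q_`v'
      }
      graph combine `graphs', cols(4)
      ```

      With 11 panels the individual plots are small, so this is best for a first scan; any variable that looks odd still deserves its own full-size qnorm.
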
      Among other things we note:

      price gives the strongest rejection of normality. That makes sense: it's a skewed distribution.

      rep78 gives the highest P-value. But it's a 5-level ordered variable. It's an accident that it appears normal at all.

      For those who are slaves to P = 0.05 and let it make their decisions for them, note that foreign gives a P-value (0.0684) that doesn't look too bad. But it's an indicator variable! It can't possibly be normal.
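
      One way to keep points like these in view is to list the number of distinct values next to W and the P-value, so that heavily discrete variables stand out. A rough sketch, assuming the auto data; the local macro k and the column layout are just choices made here.

      ```stata
      sysuse auto, clear

      di "variable{col 16}distinct{col 28}W{col 38}P"
      foreach v of var price-foreign {
          * r(r) from tabulate is the number of distinct nonmissing values
          quietly tabulate `v'
          local k = r(r)
          * swilk leaves W in r(W) and the P-value in r(p)
          quietly swilk `v'
          di "`v'{col 16}" %8.0f `k' "{col 28}" %6.4f r(W) "{col 38}" %5.4f r(p)
      }
      ```

      A variable with only two or five distinct values cannot be normal whatever its P-value says, which is the kind of thing the test alone will not tell you.
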

      So, what's the moral?

      Shotguns go off in all directions.

      What can we infer from the P-values alone? Not much, reliably.

      In any mix of variables, it should be no surprise that some are non-normal.

      swilk isn't a reliable guide. You have to look at graphs too, and think about each variable.
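
      On the question about noties: as I read the help for swilk, that option suppresses the use of averaged ranks for tied values when the test coefficients are calculated, so it can only matter for variables with ties. Here is a minimal sketch, assuming the auto data and taking rep78 as a deliberately tied variable; the local macro names are made up for illustration.

      ```stata
      sysuse auto, clear

      * rep78 has only five distinct values, so almost every value is tied
      quietly swilk rep78
      local p_ties = r(p)

      * same test, but without averaged ranks for tied values
      quietly swilk rep78, noties
      local p_noties = r(p)

      di "rep78, default: P = " %5.4f `p_ties'
      di "rep78, noties:  P = " %5.4f `p_noties'
      ```

      If the two P-values differ much, I would read that as information about ties in the data rather than as a reason to prefer whichever version happens to pass.
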

