  • ksmirnov and chitest

    Hi everyone,
    I am looking for a sound method to test whether a distribution is significantly different from a discrete uniform one. Basically I have a small sample of the outcomes of an unfair die roll and I need to "prove" that the die is unfair.
    I tried two methods. The first was
    Code:
    ksmirnov mydata = runiformint(1, 6)
    which gave
    Code:
    One-sample Kolmogorov-Smirnov test against theoretical distribution
               runiformint(1, 6)
    
     Smaller group       D       P-value  
     -----------------------------------
     mydata:            -0.3500    0.007
     Cumulative:        -5.8000    0.000
     Combined K-S:       5.8000    0.000
    while the second one was
    Code:
     chitest mydata, count sep(0)
    which gave
    Code:
    observed frequencies of mydata; expected frequencies equal
    
             Pearson chi2(5) =   2.2000   Pr =  0.821
    likelihood-ratio chi2(5) =   2.2530   Pr =  0.813
    
      +-------------------------------------------------------------+
      | mydata    observed   expected   notes   obs - exp   Pearson |
      |-------------------------------------------------------------|
      |       1          4      3.333   *           0.667     0.365 |
      |       2          2      3.333   *          -1.333    -0.730 |
      |       3          2      3.333   *          -1.333    -0.730 |
      |       4          3      3.333   *          -0.333    -0.183 |
      |       5          5      3.333   *           1.667     0.913 |
      |       6          4      3.333   *           0.667     0.365 |
      +-------------------------------------------------------------+
    
    *  1 <= expected < 5
    So basically it seems that the result depends on the method. Do you have any ideas about why this happens and/or better solutions?

    Thank you

  • #2
    Different methods will indeed give different results as K-S looks at the cumulative distribution function while a chi-square test ignores the ordering in the data.

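    A more basic problem is that you haven't presented ksmirnov with a theoretical cumulative distribution function, which must vary from 0 to 1. runiformint(1, 6) returns integers from 1 to 6 instead, which is why the reported D of -5.8 lies outside the range from -1 to 1 that a difference between two cumulative probabilities can take.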

    If I understand correctly, the cumulative for your problem is mydata/6, because a fair die has P(X <= k) = k/6 for k = 1, ..., 6.
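
    A minimal sketch of that call, assuming the 20 rolls are rebuilt from the observed frequencies in the chitest output (freq is a helper variable introduced here, not part of the original data):
    Code:
    clear
    input mydata freq
    1 4
    2 2
    3 2
    4 3
    5 5
    6 4
    end
    * one observation per roll
    expand freq
    * one-sample test against the fair-die cumulative k/6
    ksmirnov mydata = mydata/6
    Bear in mind that K-S assumes a continuous distribution, so with heavily tied discrete data like these its P-values are only approximate.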

    Further, specifying a random-number function as the cumulative makes the results hard to reproduce: they change from run to run unless you set the seed, and even with a seed set they depend slightly on which seed you chose.
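
    Both problems can be seen in a sketch like the following, where the seeds and the variable names draw1 and draw2 are arbitrary choices made for illustration:
    Code:
    clear
    set obs 20
    set seed 12345
    generate draw1 = runiformint(1, 6)
    set seed 54321
    generate draw2 = runiformint(1, 6)
    * both variables hold integers from 1 to 6, not probabilities in [0, 1]
    summarize draw1 draw2
    * number of observations where the two seeds give different draws
    count if draw1 != draw2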

    I tend to prefer chi-square tests here, possibly a case of familiarity rather than statistical optimality. One good reason is that you can look at residuals easily.
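
    If chitest is not installed, the same residuals can be sketched by hand with official commands. This assumes every face occurs at least once, so that contract returns six rows; freq, observed, n, expected, pearson and x2 are names chosen here for illustration:
    Code:
    clear
    input mydata freq
    1 4
    2 2
    3 2
    4 3
    5 5
    6 4
    end
    * rebuild one observation per roll, then tabulate back to frequencies
    expand freq
    contract mydata, freq(observed)
    * expected frequency under a fair die is n/6 for every face
    egen n = total(observed)
    generate expected = n / 6
    * Pearson residual and the chi-square statistic as the sum of their squares
    generate pearson = (observed - expected) / sqrt(expected)
    egen x2 = total(pearson^2)
    list mydata observed expected pearson, noobs
    display "Pearson chi2(5) = " %6.4f x2[1] "   Pr = " %5.3f chi2tail(5, x2[1])
    With the frequencies above, this should reproduce the Pearson chi2(5) = 2.2000 and Pr = 0.821 that chitest reported.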



    Note: chitest is from tab_chi on SSC (FAQ Advice #12).
    Last edited by Nick Cox; 17 Feb 2022, 05:30.

  • #3
    Thank you very much for your help.