
  • signrank includes pairs with zero difference in the test statistic. Why?

    Hi,

    Pairs with zero difference are usually excluded when the Wilcoxon matched-pairs signed-ranks test statistic is calculated, but the Stata implementation of this test (signrank) includes them and adjusts the variance of the statistic instead (STB reference from 1995). These approaches will in general not lead to identical P-values if zeros are present. Which method is preferable and why?

    I noticed the problem when a colleague of mine tried to replicate, in SPSS version 23, an analysis I had performed in Stata 14.1.

    The number of pairs in the analysis was 31 and only 11 of the differences were non-zero.

    The P-value in SPSS was .119 compared to .278 in Stata.

    Stata also gives a P-value of .119 if the pairs with zero difference are dropped before running signrank.

    Comments on that?

    Pär-Ola Bendahl
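    For readers who want to see the two conventions side by side outside Stata and SPSS: scipy's wilcoxon function exposes both through its zero_method argument ("wilcox" drops zero differences, as in the SPSS/R textbook version; "pratt" keeps them in the ranking, in the same spirit as Stata's signrank, though not with Stata's exact variance adjustment). A minimal sketch with made-up data, not the original dataset:

```python
# Sketch: contrast the two zero-difference conventions in scipy.
# zero_method="wilcox" drops zero differences (textbook/SPSS/R convention);
# zero_method="pratt" keeps them in the ranking, similar in spirit to
# Stata's signrank (Stata's variance adjustment is not identical).
from scipy.stats import wilcoxon

# Hypothetical paired scores with five zero differences (illustrative only).
before = [3, 5, 2, 4, 4, 1, 2, 3, 5, 2]
after = [3, 5, 2, 4, 4, 2, 4, 6, 1, 7]

stat_w, p_wilcox = wilcoxon(before, after, zero_method="wilcox")
stat_p, p_pratt = wilcoxon(before, after, zero_method="pratt")

print(f"drop zeros (wilcox): p = {p_wilcox:.3f}")
print(f"keep zeros (pratt):  p = {p_pratt:.3f}")
```

With zeros present, the two p-values generally differ, which is exactly the discrepancy described above.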



  • #2
    Consider the limiting case in which all differences are zero.

    Would you drop them all and say that the data contain no information that is relevant?

    Comment


    • #3
      Thanks Nick,

      Good argument to consider the limiting case.

      I agree: the Stata version of the Wilcoxon matched-pairs test makes better sense than the standard test described in textbooks and implemented in, for example, SPSS and R.

      The tests answer slightly different questions: the standard implementation conditions on a nonzero difference, whereas the Stata implementation does not.

      It is slightly problematic, though, that there is no consensus regarding the definition of the Wilcoxon matched-pairs test. With access to the original data and the statistics section of a paper, all results should be possible to reproduce. Hence, it is not sufficient to state that the Wilcoxon matched-pairs signed-ranks test was used to test for differences before and after treatment.

      /Pär-Ola

      Comment


      • #4
        You can take a look at the Methods and formulas section of the user's manual entry for signrank for an explanation of why Stata handles ties as it does and why that is valid. And you can always check it empirically if you're not satisfied; see below.

        The test size is a wash: 0.049 versus 0.048 at a nominal 0.05.

        The number of ties on average is somewhat lower than what you have in your particular dataset (one-third versus two-thirds tied), but the proportion of ties doesn't make any difference (see LOWESS plot).

        . version 14.1

        . 
        . clear *

        . set more off

        . set seed `=date("2016-02-02", "YMD")'

        . 
        . program define testem, rclass
          1.         version 14.1
          2.         syntax
          3. 
        .         drop _all
          4.         quietly set obs 31
          5.         ranint left right, a(1) b(3)
          6. 
        .         signrank left = right
          7.         tempname z_tie ties
          8.         scalar define `z_tie' = r(z)
          9.         scalar define `ties' = r(N_tie)
         10. 
        .         drop if left == right
         11.         signrank left = right
         12.         return scalar z_untie = r(z)
         13.         return scalar z_tie = `z_tie'
         14.         return scalar ties = `ties'
         15. end

        . 
        . simulate z_untie = r(z_untie) z_tie = r(z_tie) ties = r(ties), reps(10000) nodots: testem

              command:  testem
              z_untie:  r(z_untie)
                z_tie:  r(z_tie)
                 ties:  r(ties)


        . 
        . foreach method in tie untie {
          2.         generate byte pos_`method' = 2 * normal(-abs(z_`method')) < 0.05
          3. }

        . format pos_* %05.3f

        . summarize pos_* ties, format

            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
             pos_tie |     10,000       0.049       0.217      0.000      1.000
           pos_untie |     10,000       0.048       0.213      0.000      1.000
                ties |     10,000     10.3372    2.621409          1         21

        . 
        . graph twoway scatter z_untie z_tie, msize(small) mcolor(black) || ///
        >         line z_tie z_tie, sort lpattern(dash) lcolor(white) ///
        >         ylabel( , angle(horizontal) nogrid) legend(off)

        . quietly graph export scatter.png

        . 
        . generate double delta = z_tie - z_untie

        . generate double avg = (z_tie + z_untie) / 2

        . summarize delta, meanonly

        . graph twoway scatter delta avg, mcolor(black) msize(small) ///
        >         yline(`=r(mean)', lcolor(black) lpattern(dash)) ylabel( , angle(horizontal) nogrid)

        . quietly graph export ba.png

        . 
        . lowess delta ties, mcolor(black) msize(small) lineopts(lcolor(black) lpattern(dash)) ///
        >         ylabel(, angle(horizontal) nogrid) ytitle(tied z - untied z) xtitle(Number of Ties)

        . quietly graph export zties.png

        . 
        . exit

        end of do-file

        [Attached graphs: zties.png, scatter.png, ba.png]
        Attached Files

        Comment


        • #5
          Thanks Joseph,

          Your simulations indicate, contrary to what I expected, that the Z-statistics, with and without the zero-difference pairs excluded, have the same distribution under the null hypothesis. Furthermore, the difference seems to be independent of the number of ties observed. So, as far as I understand, these simulations indicate that the way of handling tied observations might matter in a particular case, but one method does not give systematically lower P-values than the other. Is that a correct interpretation?

          Thanks for digging into this and for sharing your code from which I learned a lot.

          /Pär-Ola

          Comment


          • #6
            Once, this was also a concern of mine. I believe the theory (whether or not to use ties in the estimation) is the crucial issue here, not the statistical package itself. Naturally, differences may arise on account of default options. However, we still get quite similar results in the three packages (Stata, SPSS and R), provided the options are equivalent. We may change the default in SPSS, for example, and request an "exact" estimation. In R, even if we don't select the "exact" option, then according to R's help files, "by default (if exact is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used". We may get similar results between R and Stata under default options. This notwithstanding, we may also tell R not to correct for ties, and the results will be similar to the ones found in SPSS under default estimation.

            Hopefully that helps.

            Best,

            Marcos

            Comment


            • #7
              Yes, those would be my interpretations, as well.

              As far as using the exact distribution of the test statistic for sample sizes under 50 goes, it seems that with as few as 31 observations the test size is well maintained even with the normal approximation (0.049 at a nominal 0.05).
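              The closeness of the exact and normal-approximation p-values at this sample size can be sketched in Python's scipy (assuming scipy >= 1.9 for the method keyword; the seed and effect size here are arbitrary illustrations):

```python
# Sketch: exact vs. normal-approximation p-values for the signed-rank test
# on a modest sample, using scipy (the `method` keyword needs scipy >= 1.9).
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(20160202)  # arbitrary seed
d = rng.normal(0.3, 1.0, size=31)      # 31 continuous, hence untied, differences

_, p_exact = wilcoxon(d, method="exact")
_, p_approx = wilcoxon(d, method="approx", correction=False)

print(f"exact p = {p_exact:.4f}, normal approximation p = {p_approx:.4f}")
```

For n = 31 with no ties or zeros, the two p-values are typically close, consistent with the well-maintained test size above.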

              Comment


              • #8
                Thanks Nick, Joseph and Marcos for your helpful comments.

                Comment


                • #9
                  No systematic difference between the Z-statistics with and without ties was seen under the null hypothesis.

                  By adding 1 to all the left values in Joseph's code above (post #4) and re-running it, considerable differences between the two Z-statistics were observed under the alternative hypothesis of a one-unit score difference:

                  [Attached graph: combo.png]


                  This leaves me with some worry.

                  Comment


                  • #10
                    Stata's method of retaining tied values does seem to have very slightly lower power than SPSS's / R's method of dropping them. Power for both methods drops with increased proportion of ties, Stata's method perhaps very slightly more so.

                    The do-file and results are too long to display in the body of the post and so are attached.
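                    For anyone without Stata at hand, the null-hypothesis part of the simulation in #4 can be roughed out in Python with scipy, using zero_method="pratt" as a stand-in for Stata's zero handling (the variance adjustment is not identical) and "wilcox" for the drop-the-zeros convention; both empirical sizes should land near the nominal 0.05:

```python
# Rough Python analogue of the null simulation in #4: empirical rejection
# rates at nominal 0.05 for the two zero-handling conventions.
# zero_method="pratt" is only a stand-in for Stata's signrank adjustment.
import warnings
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(20160202)  # arbitrary seed
reps, n, alpha = 1000, 31, 0.05
reject = {"wilcox": 0, "pratt": 0}

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence small-sample approximation warnings
    for _ in range(reps):
        left = rng.integers(1, 4, size=n)   # scores on {1, 2, 3}, as in #4
        right = rng.integers(1, 4, size=n)
        for zm in reject:
            _, p = wilcoxon(left, right, zero_method=zm)
            reject[zm] += p < alpha

for zm, k in reject.items():
    print(f"{zm}: empirical size = {k / reps:.3f}")
```

This mirrors only the test-size check; the power comparison under the alternative would need a shift added to one variable, as in post #9.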
                    Attached Files

                    Comment
