signrank vs signtest

Ignacio Herrando

Join Date: Sep 2023

Posts: 10
#1

12 Jan 2024, 05:49

Hi!
I am doing statistical analyses of my data and I am stuck on this:
I have paired data, and a normality test on the paired differences rejects the null hypothesis of normality. I am unsure which test I should apply: the sign test or the signed-rank test?
Nick Cox

Join Date: Mar 2014

Posts: 35224
#2

12 Jan 2024, 06:41

A transformation is likely to be an alternative. For more focused answers, please post a data example.
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2371
#3

12 Jan 2024, 08:38

Why do you think you need to test normality of the difference? You are likely to get a better answer if you can give more context for what you actually want to do.

Ignacio Herrando

Join Date: Sep 2023
Posts: 10
#4

12 Jan 2024, 12:16

This is an example of my dataset.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input double(N_CH2_tau2 T_CH2_tau2) byte Lymphocitesinfiltration
4.16362841156473 4.93538492231974 1
4.38470511471866 4.93289448950807 1
 3.9493157572978 4.04775540111395 1
4.47644020828451 4.75038269721282 1
4.12868005927907 4.77916901635508 0
4.31110733928542 4.12566781745431 1
4.83556200147964 5.29749377159097 1
4.56658845640625 4.89196245518949 1
4.48400095879768 4.75985656736093 1
4.40486274382875 4.54481596586979 1
 4.5983157410647  4.6633063527499 1
4.21613697444366 4.27119319236915 0
4.85933269685788 5.22239435194727 1
5.45482946950453 5.58846685187143 1
4.57711638544509 5.75163545160762 1
4.16308458121423 5.12147385765405 0
4.43239219833375 5.63630914901879 1
4.24061891018675 3.95994141095672 0
4.59150353045964 4.36424061466438 1
 4.4238299256236 4.64366701993502 1
4.23387907406631 4.11408731220613 0
4.24643357173501 4.59567919453985 0
4.15651344501834 3.41584855122132 0
      5.30327974       4.52308619 0
4.36995937897215 5.02397481688922 1
 4.2998862760854 4.21192219841959 1
4.40494097317701 5.31806141911331 1
4.21481654375488 4.85175014349343 0
4.03830718803659 4.02120942635436 0
4.58089427947384 4.52710956938633 1
4.38193111570181 5.39471843901298 0
4.61430316888822 5.05598752438452 0
4.58372983528239 5.13568258756673 1
4.32904701316943  5.1009343254462 0
3.87086878694955 4.84205633550877 0
5.28211997588777 4.35908807821993 0
4.16397217024611   5.165359248022 0
4.53031167404213 4.57616050269693 1
4.22184512701371 4.53586702071666 0
4.60052320402511 4.90492741175789 1
4.87936255138056 4.61021657371471 1
4.22427574017596 3.92869554332369 1
4.62167922224306  4.8033279953994 1
4.50497389883816 5.65964474079165 0
3.92205834707815 3.86336060341806 0
 4.4382744244106 4.42110529830977 1
 4.2714522966182 4.54426185394201 0
4.87369044814199 6.49479851619584 1
4.09186092973049  4.2139167621973 1
3.97902532519957 5.06118113886074 0
4.15454748356458  4.9283981576689 0
4.53575418873808 4.82913318527197 0
4.84408795722338 4.14175109364261 0
4.84645889890932 5.98064316794055 0
5.09490574890632 4.67737630933263 1
4.88455498346002 4.00766563995804 1
end
label values Lymphocitesinfiltration LymphInfilt
label def LymphInfilt 0 "No", modify
label def LymphInfilt 1 "Yes", modify

N means normal tissue and T means tumoral tissue. I am comparing the values from each tissue with regard to the variable "Lymphocitesinfiltration". As both N and T values were measured on each patient, I should use a paired test. I calculated the difference between these values and checked the normality of the difference to choose between a paired t test and a non-parametric paired test. As far as I know, when the difference is not normally distributed, you should check symmetry. I did this in SPSS, but now I am working with Stata. When you type "sum Difference_CH2, detail", the standard error of the skewness is not shown. I don't know how to obtain the standard error of the skewness in order to calculate skewness / SE of skewness and then choose between a Wilcoxon test and a sign test. What do you recommend in this case?
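For reference, a minimal sketch of how the two tests in the thread title could be run on the paired variables, and of how a standard error for the skewness might be computed by hand. It assumes the example data above are in memory. The scalar names (sk_n, sk_g1, sk_G1, sk_se) are illustrative. The SE is the usual textbook formula for the bias-adjusted skewness; summarize reports the unadjusted moment-based skewness, so the adjustment is applied explicitly and the result may differ slightly from other software.

Code:

```stata
* Paired differences
generate double Difference_CH2 = N_CH2_tau2 - T_CH2_tau2

* Wilcoxon matched-pairs signed-rank test (assumes symmetric differences)
signrank N_CH2_tau2 = T_CH2_tau2

* Sign test (no symmetry assumption; tests whether the median difference is 0)
signtest N_CH2_tau2 = T_CH2_tau2

* Skewness from -summarize- plus a hand-computed standard error
quietly summarize Difference_CH2, detail
scalar sk_n  = r(N)
scalar sk_g1 = r(skewness)
* Bias-adjusted skewness and its textbook standard error
scalar sk_G1 = sk_g1 * sqrt(sk_n * (sk_n - 1)) / (sk_n - 2)
scalar sk_se = sqrt(6 * sk_n * (sk_n - 1) / ((sk_n - 2) * (sk_n + 1) * (sk_n + 3)))
display "adjusted skewness = " sk_G1 "   SE = " sk_se "   ratio = " sk_G1 / sk_se
```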


Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2371
#5

12 Jan 2024, 12:49

What question are you trying to answer? Is lymph infiltration in the tumour tissue greater than in normal tissue?
Ignacio Herrando

Join Date: Sep 2023

Posts: 10
#6

12 Jan 2024, 13:27

Exactly, greater or lower... it doesn't matter
Bruce Weaver

Join Date: May 2014

Posts: 1109
#7

12 Jan 2024, 13:56

Hello Ignacio Herrando. Presumably, you want the SE of skewness so that you can carry out a z-test (z = skewness estimate / SE). If so, my advice is to not rely on statistical tests of the assumptions of other statistical tests. I would use graphical methods instead. And when I add these lines to the code you shared in #4...

Code:

generate diff = N_CH2_tau2 - T_CH2_tau2
pnorm diff // Standardized normal probability plot

...I get a pretty good looking normal probability plot.
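As a further sketch of graphical checks in the same spirit, two other built-in commands could be tried on the same variable. This assumes diff has already been generated as above; neither command is part of the original post.

Code:

```stata
* Quantile-normal plot: more sensitive to departures in the tails than -pnorm-
qnorm diff

* Histogram of the differences with a normal density overlaid
histogram diff, normal
```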

Finally, given the presence of that dichotomous variable (Lymphocitesinfiltration), I wonder if you really want to estimate a mixed design ANOVA model with Lymphocitesinfiltration as a between-Ss factor and N vs T as a within-Ss factor.

I hope this helps.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2371
#8

12 Jan 2024, 14:59

I would suggest a paired t-test then, or a within-subjects design as Bruce suggested. I’m not at my computer but a mixed model can be used for this approach if you can reorganize your data to have one observation per person per tissue type. The following is an idea but I haven’t tested it as I’m not in front of Stata at the moment.

Code:

mixed count i.tissue_type || person : , resid(ind, by(tissue_type)) reml dfmethod(bwithin)

This should model the counts between tissue types, allowing heterogeneous variance by tissue type.
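The reorganization mentioned above could look something like the following sketch, which assumes the example data from #4 are in memory. The names person, tau2_, tissue, tissue_type, and count are chosen here only to line up with the untested command above; they are not in the original data.

Code:

```stata
* One row per patient in the wide data: create an identifier
generate long person = _n

* Move the tissue code to the end of the name so -reshape- can use it as j
rename (N_CH2_tau2 T_CH2_tau2) (tau2_N tau2_T)

* Wide to long: one observation per person per tissue type
reshape long tau2_, i(person) j(tissue) string

* Numeric, labelled tissue indicator for factor-variable notation
encode tissue, generate(tissue_type)

* Match the outcome name used in the model above
rename tau2_ count
```

Lymphocitesinfiltration is constant within person, so it is carried along unchanged and remains available as a between-subjects factor.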

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17613
#9

13 Jan 2024, 05:24

Ignacio:
as an aside to previous helpful replies:
1) to get the SE of the mean, you should use -mean- instead of -summarize-:

Code:

. g diff = N_CH2_tau2 - T_CH2_tau2
. mean diff

Mean estimation                             Number of obs = 56

--------------------------------------------------------------
             |       Mean   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
        diff |  -.2905432   .0767005     -.4442545    -.136832
--------------------------------------------------------------

.

2) a paired -ttest- can be easily performed as follows:

Code:

. ttest N_CH2_tau2 == T_CH2_tau2

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
N_CH2_~2 |      56    4.461725    .0462398    .3460272    4.369058    4.554391
T_CH2_~2 |      56    4.752268    .0771007    .5769685    4.597755    4.906781
---------+--------------------------------------------------------------------
    diff |      56   -.2905432    .0767005     .573974   -.4442545    -.136832
------------------------------------------------------------------------------
     mean(diff) = mean(N_CH2_tau2 - T_CH2_tau2)                   t =  -3.7880
 H0: mean(diff) = 0                              Degrees of freedom =       55

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0002         Pr(|T| > |t|) = 0.0004          Pr(T > t) = 0.9998

3) a one-sample -ttest- (vs mean=0 as yardstick) gives back, as expected, the very same results:

Code:

. ttest diff == 0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
    diff |      56   -.2905432    .0767005     .573974   -.4442545    -.136832
------------------------------------------------------------------------------
    mean = mean(diff)                                             t =  -3.7880
H0: mean = 0                                     Degrees of freedom =       55

    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 0.0002         Pr(|T| > |t|) = 0.0004          Pr(T > t) = 0.9998

Kind regards,
Carlo
(StataNow 18.5)


Nick Cox

Join Date: Mar 2014

Posts: 35224
#10

13 Jan 2024, 05:51

Thanks for posting example data. As sidenotes to other answers, I did this

Code:

. gen difference = N - T
.
. moments diff, by(L)

----------------------------------------------------------------------
    Group |      n        mean          SD    skewness    kurtosis
----------+-----------------------------------------------------------
       No |     25      -0.330       0.651       0.499       2.107
      Yes |     31      -0.259       0.513      -0.569       3.692
----------------------------------------------------------------------

.
. qplot difference, over(L) trscale(invnormal(@)) xla(-2/2) xtitle(standard normal deviate) aspect(1)
.

Here moments from SSC is just a convenience wrapper for summarize results, and qplot from the Stata Journal allows normal quantile plots to be superimposed, thus going some way beyond what qnorm allows.

One subset is a bit skewed one way and the other the other way, and similar comments apply to tail weight. I would feel happy here that t tests are not too misleading -- in the sense that they answer the question posed. That said, the plot above may not seem clinically convincing.
Ignacio Herrando

Join Date: Sep 2023

Posts: 10
#11

15 Jan 2024, 03:51

Thank you all!!! I will try these suggestions!
