
  • dtable and ranksum test

    Dear Listers

    I haven't worked much with Stata 18 yet, but I have tried to make a dtable reporting the median, and I want to test between-group differences using the rank-sum test.

    Code:
    dtable var1 var2 var3, continuous(var1 var2 var3, stat(p5 p50 p95)) by(group)
    How do I include the ranksum test in that?

    Thank you
    Lars

  • #2

    You will need to add option test in option by() and option
    test(kwallis) within option continuous(). The
    Kruskal-Wallis test is a multiple-sample generalization of the
    two-sample (Wilcoxon / Mann-Whitney) rank-sum test.

    Here is an example using the auto data.
    Code:
    sysuse auto
    
    dtable mpg turn price, ///
        nformat(%9.1f p5 p50 p95) ///
        continuous(, stat(p5 p50 p95) test(kwallis)) ///
        by(foreign, test)
    Here is the resulting output.
    Code:
    . sysuse auto
    (1978 automobile data)
    
    .
    . dtable mpg turn price, ///
    >         nformat(%9.1f p5 p50 p95) ///
    >         continuous(, stat(p5 p50 p95) test(kwallis)) ///
    >         by(foreign, test)
    note: using test kwallis across levels of foreign for mpg, turn, and price.
    
    ------------------------------------------------------------------------------------------
                                                     Car origin
                             Domestic              Foreign                Total          Test
    ------------------------------------------------------------------------------------------
    N                            52 (70.3%)            22 (29.7%)           74 (100.0%)
    Mileage (mpg)            14.0 19.0 29.0        17.0 24.5 35.0        14.0 20.0 34.0  0.002
    Turn circle (ft.)        34.0 42.0 48.0        33.0 36.0 38.0        33.0 40.0 46.0 <0.001
    Price             3667.0 4782.5 13594.0 3798.0 5759.0 11995.0 3748.0 5006.5 13466.0  0.298
    ------------------------------------------------------------------------------------------
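    To illustrate the "multiple-sample generalization" point, here is a minimal sketch that uses a grouping variable with more than two levels. It assumes the same auto data and swaps in rep78 (five repair-record categories) for the by() variable; as far as I know, observations with a missing by() variable are left out unless by()'s missing suboption is specified.

    ```stata
    sysuse auto, clear

    * same table as above, but with a five-level grouping variable (rep78)
    dtable mpg turn price, ///
        nformat(%9.1f p5 p50 p95) ///
        continuous(, stat(p5 p50 p95) test(kwallis)) ///
        by(rep78, test)

    * the same test for a single variable, run directly for comparison
    kwallis mpg, by(rep78)
    ```

    With only two groups, as in by(foreign), the Kruskal-Wallis test reduces to the two-sample rank-sum comparison.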



    • #3
      Hello everyone,
      Thank you very much for the earlier response. I was wondering whether, in a similar case, it is possible to use the Kruskal-Wallis test while accounting for ties. In the case described here, Stata does not account for ties, so we end up with a p-value higher than the one from the ranksum command, which adjusts for ties by default.
      Thanks in advance for the response.



      • #4
        Both the Kruskal-Wallis test and the Wilcoxon-Mann-Whitney U-test allow for ties. How do you know ties are being excluded?

        Code:
        sysuse auto, clear
        
        kwallis rep78, by(foreign)
        di r(chi2_adj)
        di chi2tail(r(df), r(chi2_adj))
        
        ranksum rep78, by(foreign)
        di (r(z))^2
        di 2*(1 - normal(abs(r(z))))
        Result: note that the square of a z statistic is equivalent to a χ² statistic with 1 df, and that the p-values are the same. There is perhaps an argument that the p-value display of -kwallis- is slightly misleading, but it doesn't result in different conclusions.

        Code:
        . kwallis rep78, by(foreign)
        
        Kruskal–Wallis equality-of-populations rank test
        
          +---------------------------+
          |  foreign | Obs | Rank sum |
          |----------+-----+----------|
          | Domestic |  48 |  1317.00 |
          |  Foreign |  21 |  1098.00 |
          +---------------------------+
        
          chi2(1) = 22.410
             Prob = 0.0001
        
          chi2(1) with ties = 25.050
                       Prob = 0.0001
        
        . di r(chi2_adj)
        25.049655
        
        . di chi2tail(r(df), r(chi2_adj))
        5.587e-07
        
        .
        . ranksum rep78, by(foreign)
        
        Two-sample Wilcoxon rank-sum (Mann–Whitney) test
        
             foreign |      Obs    Rank sum    Expected
        -------------+---------------------------------
            Domestic |       48        1317        1680
             Foreign |       21        1098         735
        -------------+---------------------------------
            Combined |       69        2415        2415
        
        Unadjusted variance     5880.00
        Adjustment for ties     -619.69
                             ----------
        Adjusted variance       5260.31
        
        H0: rep78(foreign==Domestic) = rep78(foreign==Foreign)
                 z = -5.005
        Prob > |z| = 0.0000
        Exact prob = 0.0000
        
        . di (r(z))^2
        25.049655
        
        . di 2*(1 - normal(abs(r(z))))
        5.587e-07



        • #5
          I was referring to the test(kwallis) option of the dtable command, which does not apply the adjustment for ties.

          Code:
          sysuse auto
          
          dtable headroom, ///
              continuous(, stat(p5 p50 p95) test(kwallis)) ///
              by(foreign, test)
              
          ranksum headroom, by (foreign)
          kwallis headroom, by (foreign)
          Code:
          . dtable headroom, ///
          >     continuous(, stat(p5 p50 p95) test(kwallis)) ///
          >     by(foreign, test)
          note: using test kwallis across levels of foreign for headroom.
          
          --------------------------------------------------------------------------
                                                  Car origin                        
                              Domestic          Foreign            Total        Test
          --------------------------------------------------------------------------
          N                     52 (70.3%)        22 (29.7%)       74 (100.0%)      
          Headroom (in.) 1.500 3.500 4.500 2.000 2.500 3.500 1.500 3.000 4.500 0.012
          --------------------------------------------------------------------------
          
          
          
          . ranksum headroom, by (foreign)
          
          Two-sample Wilcoxon rank-sum (Mann–Whitney) test
          
               foreign |      Obs    Rank sum    Expected
          -------------+---------------------------------
              Domestic |       52      2162.5        1950
               Foreign |       22       612.5         825
          -------------+---------------------------------
              Combined |       74        2775        2775
          
          Unadjusted variance     7150.00
          Adjustment for ties     -204.15
                               ----------
          Adjusted variance       6945.85
          
          H0: headroom(foreign==Domestic) = headroom(foreign==Foreign)
                   z =  2.550
          Prob > |z| = 0.0108
          Exact prob = 0.0102
          
          . kwallis headroom, by (foreign)
          
          Kruskal–Wallis equality-of-populations rank test
          
            +---------------------------+
            |  foreign | Obs | Rank sum |
            |----------+-----+----------|
            | Domestic |  52 |  2162.50 |
            |  Foreign |  22 |   612.50 |
            +---------------------------+
          
            chi2(1) =  6.316
               Prob = 0.0120
          
            chi2(1) with ties =  6.501
                         Prob = 0.0108
          Sorry, I may have explained it incorrectly.





          • #6
            Currently dtable computes the p-value using the r(chi2)
            result after kwallis. In a future update we hope to add support
            for using r(chi2_adj) instead.
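
            As a quick check of the difference between the two stored results, here is a minimal sketch using the headroom example from #5. It relies only on the r(df), r(chi2), and r(chi2_adj) results that kwallis stores.

            ```stata
            sysuse auto, clear

            * kwallis stores both the unadjusted and the tie-adjusted statistic
            quietly kwallis headroom, by(foreign)

            * p-value from r(chi2): should match the 0.012 in the dtable output in #5
            display %6.4f chi2tail(r(df), r(chi2))

            * p-value from r(chi2_adj): should match the 0.0108 reported by ranksum
            display %6.4f chi2tail(r(df), r(chi2_adj))
            ```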

            In the meantime, you can override the kwallis test result with
            your own p-value provided you tag it properly. Here is an example using
            the auto data example. Note that I change the numeric format so we can
            see 4 decimal places in the reported p-value.
            Code:
            sysuse auto
            
            dtable headroom, ///
                continuous(, stat(p5 p50 p95) test(kwallis)) ///
                nformat(%9.4f _dtable_test) ///
                by(foreign, test)
            
            * perform/collect alternate test -- will need to be done for each
            * continuous variable
            kwallis headroom, by(foreign)
            collect get kwallis_ties=(chi2tail(r(df),r(chi2_adj))), ///
                tags(foreign[_dtable_test] var[headroom])
            
            * remove kwallis_ties from autolevels
            collect query autolevels result
            collect style autolevels result _dtable_stats _dtable_test, clear
            
            * redefine composite using custom test result
            collect query composite _dtable_test
            collect composite define _dtable_test = kwallis_ties, trim replace
            
            * replay table with changes
            collect layout
            Here is the resulting table.
            Code:
            ---------------------------------------------------------------------------
                                                    Car origin
                                Domestic          Foreign            Total        Test
            ---------------------------------------------------------------------------
            N                     52 (70.3%)        22 (29.7%)       74 (100.0%)
            Headroom (in.) 1.500 3.500 4.500 2.000 2.500 3.500 1.500 3.000 4.500 0.0108
            ---------------------------------------------------------------------------
            I recognize this kind of modification can be intimidating, given the
            need to loop over the continuous variables and to change an undocumented
            composite result, so I wrote a program that performs the modification,
            provided you give it the by() variable.
            Code:
            program dtable_mod_kwallis_ties
                version 18
                syntax [varlist(numeric default=none)] , by(varname)
            
                if `:length local varlist' == 0 {
                    * find all the continuous variables in the -var- dimension
                    quietly collect levelsof var
                    local varlist `"`s(levels)'"'
                    unab alist : *
                    local varlist : list varlist & alist
                }
            
                foreach var of local varlist {
                    * perform kwallis, account for ties in p-value
                    quietly kwallis `var', by(`by')
                    collect get ///
                        kwallis_ties=(chi2tail(r(df),r(chi2_adj))), ///
                        tags(`by'[_dtable_test] var[`var'])
                }
                * remove new result kwallis_ties from its autolevels
                quietly collect query autolevels result
                local levels = s(levels)
                local levels : subinstr local levels "kwallis_ties" ""
                collect style autolevels result `levels', clear
            
                * replace -kwallis- with -kwallis_ties- in _dtable_test
                quietly collect query composite _dtable_test
                local elements = s(elements)
                local elements : subinstr local ///
                    elements "kwallis" "kwallis_ties", word
                collect composite define _dtable_test = `elements', trim replace
            
                * replay table
                collect preview
            end
            Using this program the above example shortens to:
            Code:
            . sysuse auto
            (1978 automobile data)
            
            .
            . dtable headroom, ///
            >         continuous(, stat(p5 p50 p95) test(kwallis)) ///
            >         nformat(%9.4f _dtable_test) ///
            >         by(foreign, test)
            note: using test kwallis across levels of foreign for headroom.
            
            ---------------------------------------------------------------------------
                                                    Car origin
                                Domestic          Foreign            Total        Test
            ---------------------------------------------------------------------------
            N                     52 (70.3%)        22 (29.7%)       74 (100.0%)
            Headroom (in.) 1.500 3.500 4.500 2.000 2.500 3.500 1.500 3.000 4.500 0.0120
            ---------------------------------------------------------------------------
            
            .
            . dtable_mod_kwallis_ties, by(for)
            
            ---------------------------------------------------------------------------
                                                    Car origin
                                Domestic          Foreign            Total        Test
            ---------------------------------------------------------------------------
            N                     52 (70.3%)        22 (29.7%)       74 (100.0%)
            Headroom (in.) 1.500 3.500 4.500 2.000 2.500 3.500 1.500 3.000 4.500 0.0108
            ---------------------------------------------------------------------------



            • #7
              Thank you very much, Jeff, for your explanations and for the program.
              It works perfectly.



              • #8
                Sorry Jeff for bothering you again, but I don't understand why, when modifying the presentation by putting N under the variable name in your code, it removes the new p-value.

                Code:
                sysuse auto
                
                dtable headroom, ///
                    continuous(, stat(p5 p50 p95) test(kwallis)) ///
                    sample(, place(seplabels)) ///
                    nformat(%9.4f _dtable_test) ///
                    by(foreign, test)
                
                * perform/collect alternate test -- will need to be done for each
                * continuous variable
                kwallis headroom, by(foreign)
                collect get kwallis_ties=(chi2tail(r(df),r(chi2_adj))), ///
                    tags(foreign[_dtable_test] var[headroom])
                
                * remove kwallis_ties from autolevels
                collect query autolevels result
                collect style autolevels result _dtable_stats _dtable_test, clear
                
                * redefine composite using custom test result
                collect query composite _dtable_test
                collect composite define _dtable_test = kwallis_ties, trim replace
                
                * replay table with changes
                collect layout
                Code:
                                    
                                                      Car origin
                                    Domestic          Foreign            Total
                                   52 (70.3%)        22 (29.7%)       74 (100.0%)

                Headroom (in.) 1.500 3.500 4.500 2.000 2.500 3.500 1.500 3.000 4.500
                Thanks in advance for the response.



                • #9
                  When you put the sample statistics in the column header with option place(seplabels), dtable adds dimension _dtable_sample_dim to the column specification. This dimension contains the sample statistics in the labels for its values. You need to add _dtable_sample_dim[_hide] to option tags() when you collect the value[s] for result kwallis_ties.

                  In the following modified example, look for the line marked EDIT.
                  Code:
                  sysuse auto
                  
                  dtable headroom, ///
                      continuous(, stat(p5 p50 p95) test(kwallis)) ///
                      sample(, place(seplabels)) ///
                      nformat(%9.4f _dtable_test) ///
                      by(foreign, test)
                  
                  * perform/collect alternate test -- will need to be done for each
                  * continuous variable
                  kwallis headroom, by(foreign)
                  collect get kwallis_ties=(chi2tail(r(df),r(chi2_adj))), ///
                      tags(foreign[_dtable_test] var[headroom] _dtable_sample_dim[_hide]) // <-- EDIT
                  
                  * remove kwallis_ties from autolevels
                  collect query autolevels result
                  collect style autolevels result _dtable_stats _dtable_test, clear
                  
                  * redefine composite using custom test result
                  collect query composite _dtable_test
                  collect composite define _dtable_test = kwallis_ties, trim replace
                  
                  * replay table with changes
                  collect layout
                  Here is the resulting table.
                  Code:
                  ---------------------------------------------------------------------------
                                                          Car origin                         
                                      Domestic          Foreign            Total        Test 
                                     52 (70.3%)        22 (29.7%)       74 (100.0%)          
                  ---------------------------------------------------------------------------
                  Headroom (in.) 1.500 3.500 4.500 2.000 2.500 3.500 1.500 3.000 4.500 0.0108
                  ---------------------------------------------------------------------------
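
                  If you want to see the extra dimension for yourself, the following sketch inspects the collection that dtable leaves behind. It assumes the collection created by dtable is still the current one, and it uses only the standard collect dims, collect levelsof, and collect label list commands.

                  ```stata
                  sysuse auto, clear

                  * build the table with the sample statistics in the column header
                  quietly dtable headroom, ///
                      continuous(, stat(p5 p50 p95) test(kwallis)) ///
                      sample(, place(seplabels)) ///
                      by(foreign, test)

                  * list every dimension in the current collection
                  collect dims

                  * levels of the dimension added by place(seplabels)
                  collect levelsof _dtable_sample_dim

                  * labels attached to those levels (this is where the N text is kept)
                  collect label list _dtable_sample_dim, all
                  ```

                  Any value you add with collect get has to carry a tag from this dimension as well, otherwise it has no cell to land in under the column layout.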



                  • #10
                    ok
                    Thank you for the explanations.



                    • #11
                      FYI, continuous test kwallis_ties is now available in dtable -- it is part of today's update (14feb2024) to Stata 18.

                      3. dtable with option continuous(, test(ctest)) now supports continuous test (ctest)
                      kwallis_ties for reporting the p-value of the Kruskal-Wallis rank test adjusted
                      for ties.
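
                      With that update installed, the headroom example from earlier in this thread should no longer need the collect workaround. A minimal sketch, assuming a Stata 18 installation updated on or after 14feb2024 (update query will show whether yours is current):

                      ```stata
                      sysuse auto, clear

                      * request the tie-adjusted Kruskal-Wallis p-value directly
                      dtable headroom, ///
                          continuous(, stat(p5 p50 p95) test(kwallis_ties)) ///
                          nformat(%9.4f _dtable_test) ///
                          by(foreign, test)
                      ```

                      The Test column should then agree with the 0.0108 from ranksum and from the tie-adjusted kwallis result shown above.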
                      Last edited by Jeff Pitblado (StataCorp); 14 Feb 2024, 09:48.

