  • DTable and Collect to Extract Which Test Was Applied For Which Variable

    Hey there Stata friends (specifically Jeff Pitblado (StataCorp)!) -- I'm wondering if anyone has managed to store which test was applied to which variable in a dimension/level of the collection built by the table/dtable/etable commands. For example, I'd like to create a new table listing the test applied to each variable, or flag in the notes that specific variables received a particular test.

    Using the auto.dta dataset, I imagine I could do something like:

    Code:
    clear all
    sysuse auto
    
    dtable i.rep78 mpg price, ///
    by(foreign) ///
    cont(mpg, stat(mean sd) test(regress)) ///
    cont(price, stat(median iqr) test(kwallis))
    
    * (collect subcommands to format the table would go here)
    
    collect preview
    I know dtable lists the tests it ran in a note at the top of its initial table preview, but I haven't yet found a way to store that information anywhere else. Any ideas?

    Thanks!

  • #2
    Hey there Jeff Pitblado (StataCorp) -- just following up on this! Of course, anyone else please chime in too.



    • #3
      Thanks for the example.

      Here is what I came up with.
      Code:
      clear all
      
      sysuse auto
      
      * add -tests- option in -by()-, so that the tests are performed
      dtable i.rep78 mpg price, ///
          by(foreign, tests) ///
          cont(mpg, stat(mean sd) test(regress)) ///
          cont(price, stat(median iqr) test(kwallis))
      
      * get the layout; note -var- is used in the row specification, the only
      * other thing we care about is -result- (currently in the column
      * specification); -foreign- is the -by()- variable that we can now
      * ignore since the test results exist at a unique level of -foreign- for
      * each -var- level
      collect layout
      
      * list all the -result- levels; note levels -regress- and -kwallis- are
      * the names of the test results
      collect label list result, all
      
      * change the header styles so we see the variable names instead of their
      * labels
      collect style header var, level(value)
      
      * change the header styles so we see the result names
      collect style header result, level(value)
      
      * change the row header style to show the headers side-by-side instead
      * of stacked
      collect style row split
      
      * change the layout to show variable names, test results of interest,
      * and their values
      collect layout (var#result[kwallis regress])
      Here is the resulting table.
      Code:
      --------------------
      mpg   regress <0.001
      price kwallis  0.298
      --------------------



      • #4
        I forgot about the Pearson test in the example.

        Code:
        clear all
        
        sysuse auto
        
        * add -tests- option in -by()-, so that the tests are performed
        dtable i.rep78 mpg price, ///
            by(foreign, tests) ///
            cont(mpg, stat(mean sd) test(regress)) ///
            cont(price, stat(median iqr) test(kwallis))
        
        * get the layout; note -var- is used in the row specification, the only
        * other thing we care about is -result- (currently in the column
        * specification); -foreign- is the -by()- variable that we can now
        * ignore since the test results exist at a unique level of -foreign- for
        * each -var- level
        collect layout
        
        * list all the -result- levels; note levels -pearson-, -regress-, and
        * -kwallis- are the names of the test results
        collect label list result, all
        
        * change the header styles so we see the variable names instead of their
        * labels
        collect style header var, level(value)
        * for the factor variable -rep78-, show its name as the title and hide
        * its level headers
        collect style header rep78, title(name) level(hide)
        
        * change the header styles so we see the result names
        collect style header result, level(value)
        
        * change the row header style to show the headers side-by-side instead
        * of stacked; -binder()- here prevents extra column in factor row headers
        collect style row split, binder(=)
        
        * change the layout to show variable names, test results of interest,
        * and their values
        collect layout (var#result[pearson kwallis regress])
        Resulting table.
        Code:
        --------------------
        rep78 pearson <0.001
        mpg   regress <0.001
        price kwallis  0.298
        --------------------



        • #5
          Jeff -- you are a rock star. I literally don't know what I would do without DTable and all the amazing work put into it.



          • #6
            One more thing, and this may complicate things a bit. Is there a way to add a symbol to the end of the variable label/name (or if not, an additional column that contains a symbol) that could represent which test was used? I could then use the notes section to explain the symbols. I think even better would be specifying WHICH test gets a symbol -- for example, if I only wanted to mark the variables undergoing non-parametric tests.

            Is there any way of storing those tests and associated variables in a local?



            • #7
              Before you call dtable, you are in control of which test is applied to each variable you specify. However, I see how it would be nice to know which test was applied to each variable in your table, especially if you were given a collection by a colleague--instead of calling dtable yourself.
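
              A minimal sketch of keeping that record yourself, assuming you are the one calling dtable: put each test name in a local macro, pass the macros to test(), and reuse them afterwards, for example in the notes. The macro names test_mpg and test_price are made up for illustration; they are not something dtable creates.
              Code:
              sysuse auto, clear
              
              * hypothetical local names chosen for this sketch
              local test_mpg   regress
              local test_price kwallis
              
              dtable i.rep78 mpg price, ///
                  by(foreign, tests) ///
                  cont(mpg, stat(mean sd) test(`test_mpg')) ///
                  cont(price, stat(median iqr) test(`test_price'))
              
              * reuse the same locals to record each variable's test in the notes
              foreach v in mpg price {
                  collect notes "`v': `test_`v''"
              }
              
              collect preview
              Because the same locals feed both test() and the notes, the notes stay in sync with the tests that were requested. Locals vanish when a do-file finishes, so the whole block needs to be run together.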

              While you cannot automatically provide test-specific augmentations to the variable names in row headers, you can apply string formats to items in the table with the sformat() option of collect style cell.

              Here is an example of how this can be done.
              Code:
              clear all
              sysuse auto
              
              dtable i.rep78 mpg price, ///
                  by(foreign, tests) ///
                  cont(mpg, stat(mean sd) test(regress)) ///
                  cont(price, stat(median iqr) test(kwallis))
              
              * ₁ is unicode character u2081
              collect style cell result[kwallis], sformat("%s₁")
              collect note "Test₁ p-values from Kruskal-Wallis test"
              
              * ₂ is unicode character u2082
              collect style cell result[regress], sformat("%s₂")
              collect note "Test₂ p-values from Wald test"
              
              * ₃ is unicode character u2083
              collect style cell result[pearson], sformat("%s₃")
              collect note "Test₃ p-values from Pearson test"
              
              collect preview
              Here is the resulting table.
              Code:
              --------------------------------------------------------------------------------------
                                                              Car origin                            
                                       Domestic            Foreign              Total          Test
              --------------------------------------------------------------------------------------
              N                           52 (70.3%)          22 (29.7%)         74 (100.0%)        
              Repair record 1978                                                                    
                1                           2 (4.2%)            0 (0.0%)            2 (2.9%) <0.001₃
                2                          8 (16.7%)            0 (0.0%)           8 (11.6%)        
                3                         27 (56.2%)           3 (14.3%)          30 (43.5%)        
                4                          9 (18.8%)           9 (42.9%)          18 (26.1%)        
                5                           2 (4.2%)           9 (42.9%)          11 (15.9%)        
              Mileage (mpg)           19.827 (4.743)      24.773 (6.611)      21.297 (5.786) <0.001₂
              Price              4,782.500 2,050.000 5,759.000 2,641.000 5,006.500 2,147.000  0.298₁
              --------------------------------------------------------------------------------------
              Test₁ p-values from Kruskal-Wallis test
              Test₂ p-values from Wald test
              Test₃ p-values from Pearson test
              Last edited by Jeff Pitblado (StataCorp); 10 Apr 2025, 14:15.



              • #8
                This is EXACTLY what I'm looking for. Thanks a ton!!
