table: Summary stats - Adding range, effect size, and p value

Chris Martin

Join Date: Nov 2015

Posts: 95
#1

21 May 2021, 09:24

I'm trying to build a table using the new table command. I want the first column of the table to contain the variable range, e.g., "1-4" or "1-7". This needs to be the theoretical range, so even if the highest actual score is 6.8, it should be 1-7. I call this Step 1.

I want the intermediate columns to be the count, mean, and sd by gender. I've got this working perfectly (Step 2).

And then at the end of the table I need the effect size and p values that pertain to the difference between male and female levels. I'm doing this manually as well (Steps 3 and 4). Is there a way to use table so that steps 1, 3 and 4 are done by the command?

Code:

* 1. manually get ranges and put in first column

* 2. get count mean sd
table (var) (female), ///
    statistic(count a b c) ///
    statistic(mean a b c) ///
    statistic(sd a b c) ///
    nformat(%4.0f count) nformat(%4.2f mean sd) nototals
collect levelsof result
collect style header result row stack, level(hide)
collect layout (var) (female[0 1]#result)

* 3. get effect sizes as eta squared
* get eta squares
foreach var in a b c {
    quietly anova `var' female
    qui estat esize
    di e(r2)
}

* 4. get p values
foreach var in a b c {
    quietly anova `var' female
    local pModel = Ftail(e(df_m),e(df_r),e(F))
    display `pModel'
}

Thanks.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 686
#2

21 May 2021, 11:11

Here is an example using simulated data inspired by your original code:

Code:

* simulate some data
set seed 17
set obs 1000
gen female = runiformint(0,1)
label define female 0 "Male" 1 "Female"
label values female female
gen a = female + runiformint(1,7) + runiform() if runiform() > .05
gen b = female/6 + runiformint(1,4) + runiform() if runiform() > .05
gen c = runiformint(2,6) - runiform() if runiform() > .05

* compute and collect ranges
collect create ranges
foreach var in a b c {
    summarize `var'
    local min = floor(r(min))
    local max = ceil(r(max))
    collect get range="`min' - `max'", tag(var[`var'])
}
collect label levels result range "Range", modify
collect layout (var) (result)

* compute and collect statistics
table (var) (female result), ///
    name(stats) ///
    statistic(count a b c) ///
    statistic(mean a b c) ///
    statistic(sd a b c) ///
    nformat(%4.0f min max) ///
    nformat(%4.0f count) ///
    nformat(%4.2f mean sd) ///
    nototals
collect label levels result count "N" sd "SD", modify
collect preview

* compute and collect effects
collect create fits
foreach var in a b c {
    collect e(r2) ///
        p = Ftail(e(df_m),e(df_r),e(F)) ///
        , ///
        tag(var[`var']) ///
        : anova `var' female
}
collect style cell result[r2], nformat(%4.2f)
collect style cell result[p], nformat(%6.3f)
collect label levels result ///
    p "p-value" ///
    r2 "Effect size" ///
    , modify
collect layout (var) (result)

* combine, arrange, and customize
collect combine full = ranges stats fits
collect layout (var) (result[range] female#result result[r2 p])
collect style header female, title(hide)
collect preview

Here is the resulting table

Code:

. collect preview

--------------------------------------------------------------------------
  |  Range          Male               Female        Effect size   p-value
  |            N   Mean     SD     N   Mean     SD                        
--+-----------------------------------------------------------------------
a |  1 - 9   469   4.59   2.02   477   5.53   2.00          0.05     0.000
b |  1 - 6   472   3.02   1.14   472   3.20   1.16          0.01     0.016
c |  1 - 6   479   3.58   1.47   477   3.60   1.47          0.00     0.811
--------------------------------------------------------------------------
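
Note that the ranges above come from the observed minimum and maximum, which is why a shows 1 - 9 in the simulated data. If the theoretical scale limits are wanted instead, as in the original question, one option is to type them in by hand and collect them the same way. The sketch below is only an illustration: the macros range_a, range_b, and range_c and the limits stored in them are assumed values, not something taken from real data.

```stata
* hypothetical theoretical limits for each scale (assumed values)
local range_a "1 - 7"
local range_b "1 - 4"
local range_c "1 - 7"

* rebuild the ranges collection from the typed-in limits
collect create ranges, replace
foreach var in a b c {
    collect get range="`range_`var''", tag(var[`var'])
}
collect label levels result range "Range", modify
collect layout (var) (result)
```

The remaining steps (collect combine and the final collect layout) should then run unchanged, since the new collection uses the same var and result dimensions.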

Chris Martin

Join Date: Nov 2015

Posts: 95
#3

21 May 2021, 11:59

Thanks! This works great. I added collect clear at the top to make it reusable.
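
For reference, a minimal sketch of that change, assuming the rest of the example follows unchanged. collect clear drops every collection in memory, so the example starts from an empty set of collections each time it is run.

```stata
* drop all collections in memory so the script can be rerun from scratch
collect clear

* the example then proceeds as before
collect create ranges
```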

On my machine, this line does not produce an error but does not have any effect either:

Code:

collect style header female, title(hide)

Chris Martin

Join Date: Nov 2015

Posts: 95
#4

21 May 2021, 12:29

I have a follow-up question. I have a factor variable called racea, which holds race. I am trying to create a table in which just a subset of the values of racea is used. I have tried using the format listed in the help file, but regardless of where I put, say, race[1 3], I get all races.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 686
#5

21 May 2021, 13:55

Regarding #3 above, that line was part of my experimenting to get the final table. I did not realize it was not necessary when I posted.

Hiding the dimension titles is already in the default style for collections ranges and fits, and is carried forward in the combined style when collection full is created.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 686
#6

21 May 2021, 13:57

Did you specify race[1 3] or racea[1 3]?
collect does not support variable abbreviations.
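
A minimal sketch of how one might check this, assuming the collection was built from a variable named racea with levels 1 and 3 (both taken from the post above, not verified against any data). collect dims and collect levelsof report the names that collect actually recognizes.

```stata
* list the dimensions in the current collection
collect dims

* list the levels recorded for the racea dimension
collect levelsof racea

* spell the dimension name out in full; race is not expanded to racea
collect layout (var) (result[range] racea[1 3]#result result[r2 p])
```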

Chris Martin

Join Date: Nov 2015

Posts: 95
#7

24 May 2021, 09:03

It turns out that was the problem. Thanks for spotting it!

Chris Martin

Join Date: Nov 2015

Posts: 95
#8

24 May 2021, 09:08

Regarding #5, what would I have to do here to get the dimension title to show, assuming I wanted that? I tried a couple of things, but neither worked.

Code:

* end: combine, arrange, and customize
collect combine full = ranges stats fits
collect layout (var) (result[range] racea[1 3]#result result[r2 p])
collect style header racea, title(???)
collect preview

Raymond Zhang

Join Date: Jan 2021
Posts: 349
#9

24 May 2021, 12:35

Originally posted by Chris Martin

Regarding #5, what would I have to do here to get the dimension title to show, assuming I wanted that? I tried a couple of things, but neither worked.

Code:

* end: combine, arrange, and customize
collect combine full = ranges stats fits
collect layout (var) (result[range] racea[1 3]#result result[r2 p])
collect style header racea, title(???)
collect preview

@Chris Martin If you want to show the dimension title, you can use the code below:

Code:

collect combine full = ranges stats fits
collect layout (var) (result[range] female#result result[r2 p])
collect style header female, title(name)
collect preview

--------------------------------------------------------------------------
  |  Range                  female                   Effect size   p-value
  |                 Male               Female
  |            N   Mean     SD     N   Mean     SD
--+-----------------------------------------------------------------------
a |  1 - 9   469   4.59   2.02   477   5.53   2.00          0.05     0.000
b |  1 - 6   472   3.02   1.14   472   3.20   1.16          0.01     0.016
c |  1 - 6   479   3.58   1.47   477   3.60   1.47          0.00     0.811
--------------------------------------------------------------------------

Best
Raymond

Best regards.

Raymond Zhang
Stata 17.0,MP

Raymond Zhang

Join Date: Jan 2021
Posts: 349

#10

24 May 2021, 12:54

Maybe you can make it more beautiful. You can add a line below Male and Female.

Code:

. collect style cell cell_type[column-header]#female,border(bottom,pattern(single))
. collect preview

--------------------------------------------------------------------------
  |  Range          Male               Female        Effect size   p-value
  |         ---------------------------------------                       
  |            N   Mean     SD     N   Mean     SD                        
--+-----------------------------------------------------------------------
a |  1 - 9   469   4.59   2.02   477   5.53   2.00          0.05     0.000
b |  1 - 6   472   3.02   1.14   472   3.20   1.16          0.01     0.016
c |  1 - 6   479   3.58   1.47   477   3.60   1.47          0.00     0.811
--------------------------------------------------------------------------

Best regards.

Raymond Zhang
Stata 17.0,MP

Chris Martin

Join Date: Nov 2015

Posts: 95
#11

28 Jun 2021, 11:05

Thanks. Sorry for the rather late reply but is there a way to insert a break in the line between male and female, i.e., have a thin column there with no horizontal line?

Mohammad Azmain Iktidar

Join Date: May 2021
Posts: 11

#12

01 Apr 2023, 18:42

Hello, I want to add a p-value column to this table where the p-value from relevant tests (i.e., for 'age' from t-test and for 'gender' and 'course' from chi-square test) will be displayed. What should be my code?

Code:

-----------------------------------------
                   Substance Use        
                No               Yes    
-----------------------------------------
Age       22.1     (1.9)   23.6     (2.1)
                                        
Gender                                  
  Male     413   (42.4%)    126   (81.8%)
  Female   562   (57.6%)     28   (18.2%)
Course                                  
  MBBS     922   (94.6%)    146   (94.8%)
  BDS       53    (5.4%)      8    (5.2%)
-----------------------------------------

The code I used was:

Code:

table (var) (substanceuse), ///
    statistic(mean age) ///
    statistic(sd age) ///
    statistic(fvfrequency gender curriculum) ///
    statistic(fvpercent gender curriculum) ///
    nototals
collect recode result mean = column1 sd=column2 fvfrequency = column1 fvpercent   = column2
collect layout (var) (substanceuse#result[column1 column2])
collect style cell var[gender curriculum ]#result[column1], nformat(%6.0fc)
collect style cell var[gender curriculum ]#result[column2], nformat(%6.1f) sformat("(%s%%)")
collect style cell var[age]#result[column1 column2], nformat(%6.1f)
collect style cell var[age]#result[column2], sformat("(%s)")
collect style header result, level(hide)
collect style row stack, nobinder spacer
collect style cell border_block, border(right, pattern(nil))
collect preview

Thank you!

Stata 17