Use dtable to display means of one variable by levels of the categorical demographic variables

amandacpac

Join Date: Sep 2014

Posts: 55
#1

13 Jul 2023, 21:00

Hi statalisters! I have to say, I am IN LOVE with dtable!! I'm in the process of creating a table similar to a table 1, but, instead of frequencies of the categorical variables, I would like the mean (SD) of a different variable (specifically my exposure variable). Is there an easy way to do this??

For example, using the auto.dta dataset, can I get a table of means (SDs) of price by rep78 and foreign?

Thanks!
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

13 Jul 2023, 22:05

Code:

sysuse auto, clear
dtable, continuous(price, stat(mean sd)) by(rep78) nformat(%3.2f mean sd)
dtable, continuous(price, stat(mean sd)) by(foreign) nformat(%3.2f mean sd)

will give you two separate tables, one by rep78, the other by foreign. If you want a table breaking it down simultaneously by both variables, then you have two options. You could get -dtable- to do the job by creating a variable -egen stratum = group(foreign rep78)- and use -dtable-, -by(stratum)-. But with 10 such strata, the table will be a mess. A better approach would be to use -table- to do a cross-tabulation.

Code:

table (rep78) (foreign), stat(mean price) stat(sd price) nformat(%3.2f mean sd) ///
        sformat("(%s)" sd) style(Table-1)
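
For comparison, the -egen- route mentioned above might look roughly like the sketch below. It is only an illustration: the -label- option on -group()- is an addition here so that the strata carry readable value labels, and the formats are simply carried over from the earlier commands.

Code:

sysuse auto, clear

* one stratum per observed foreign-by-rep78 combination
* (observations with missing rep78 get a missing stratum)
egen stratum = group(foreign rep78), label

* mean and SD of price within each stratum
dtable, continuous(price, stat(mean sd)) by(stratum) nformat(%3.2f mean sd)

Because every combination becomes its own column, this gets wide quickly, which is why the -table- cross-tabulation is usually easier to read.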
amandacpac

Join Date: Sep 2014

Posts: 55
#3

14 Jul 2023, 10:32

Yeah, I want it all in one table, and I have a lot of variables, so I will use table. Thank you!
amandacpac

Join Date: Sep 2014

Posts: 55
#4

14 Jul 2023, 10:44

Actually, that isn't quite what I'm looking for. I was hoping to get a true Table 1 style table. I have attached an example of the table I am looking for. This one has Median (Q1, Q3) instead of Mean (SD). I would actually prefer Median (Q1, Q3). I also have 3 groups and about 10-15 categorical variables.

Attached Files

Last edited by amandacpac; 14 Jul 2023, 10:52.
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#5

14 Jul 2023, 12:41

I have attached an example of the table I am looking for.

Fine. I do hope you realize that this is nothing like what you asked for in #1.

So this is something that -dtable- can do. Something like this:

Code:

webuse nhanes2, clear
dtable, continuous(age height weight, stat(q1 q2 q3) test(regress)) ///
        nformat(%2.1f q1 q2 q3) ///
        factor(sex heartatk diabetes, stat(fvfreq fvpercent) test(pearson)) ///
        nformat(%2.1f fvpercent) ///
        by(race, nototals tests)
amandacpac

Join Date: Sep 2014

Posts: 55
#6

14 Jul 2023, 12:53

I'm sorry, this isn't quite what I'm asking for. Using the nhanes data, what I would like, for example, would be the median ages in the places where you have the frequencies of sex, prior heart attack, diabetes status, etc. So, what is the median age among white male, white female, white people that have had no prior heart attack, white people that have had a prior heart attack, white people that are not diabetic, and then the same for Black and Other. Does that make sense?
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#7

14 Jul 2023, 14:42

I'm sorry, but I can't visualize that from your description. Can you draw it?
amandacpac

Join Date: Sep 2014

Posts: 55
#8

14 Jul 2023, 15:03

Ok, so, for my data I want the distribution of the exposure variable (a continuous variable) by the sociodemographic variables I have in Table 1 (just the categorical variables, which is most of them). For the NHANES data, I have made an example table using age as the continuous "exposure" variable, and sex, prior heart attack, and diabetes as the categorical sociodemographic variables. Here is a picture of that table I created. To get these numbers, I used -univar age if race==1, by(sex)- for each race and then redid that for the other two categorical variables.
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#9

14 Jul 2023, 15:41

I don't think this can be done in -dtable-. With -table- I can get the contents you wish, but I don't know how to get the first and third quartiles in parentheses. This is as close to what you're asking for as I can get. Maybe somebody else following the thread can put the finishing touches on.

Code:

table ( sex heartatk diabetes ) ( race ), nototals statistic(q1 age) statistic(q2 age) statistic(q3 age)

Last edited by Clyde Schechter; 14 Jul 2023, 15:43.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 699

#10

15 Jul 2023, 00:36

Here is the code I used to reproduce the table in #8.

Code:

webuse nhanes2l

table () (race), ///
        statistic(frequency) ///
        statistic(percent) ///
        append ///
        nototals

collect addtags sample[N], fortags(result[frequency percent])
collect style header sample, title(hide)

local catvars sex heartatk diabetes

foreach c of local catvars {
        table (`c') (race) , ///
                statistic(q1 age) ///
                statistic(q2 age) ///
                statistic(q3 age) ///
                append ///
                nototals
}

collect style cell result[percent], nformat(%5.1f) sformat("(%s%%)")

collect composite define iqi = q1 q3, trim delimiter(", ")
collect style cell result[iqi], sformat("(%s)")

collect composite define show = frequency percent q2 iqi, trim

collect style header result, title(hide) level(hide)

collect layout (sample `catvars') (race#result[show])

Here is the resulting table from my Stata session.

Code:

-----------------------------------------------------------------
                   |                      Race                   
                   |          White           Black         Other
-------------------+---------------------------------------------
N                  |  9,065 (87.6%)   1,086 (10.5%)    200 (1.9%)
Sex                |                                             
  Male             |    49 (31, 63)   46 (28.5, 63)   43 (28, 63)
  Female           |    50 (31, 64)     45 (29, 62)   43 (27, 62)
Prior heart attack |                                             
  No heart attack  |    48 (31, 63)     44 (28, 61)   43 (27, 62)
  Had heart attack |  64 (60, 69.5)     65 (59, 69)   67 (65, 69)
Diabetes status    |                                             
  Not diabetic     |    48 (31, 63)     44 (28, 61)   41 (27, 61)
  Diabetic         |  64 (55.5, 69)     63 (52, 68)   65 (61, 68)
-----------------------------------------------------------------
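
The same -collect- pattern could in principle be adapted to the mean (SD) layout asked about in #1. The following is a minimal sketch along those lines using auto.dta, not a tested recipe: the composite name meansd, the %9.1f format, and the export file name table1.docx are all made up for illustration.

Code:

sysuse auto, clear

* mean and SD of price in each rep78-by-foreign cell
table (rep78) (foreign), ///
        statistic(mean price) ///
        statistic(sd price) ///
        nototals

* one decimal place for both statistics, parentheses around the SD
collect style cell result[mean sd], nformat(%9.1f)
collect style cell result[sd], sformat("(%s)")

* glue the two results into a single "mean (sd)" cell
* (meansd is a hypothetical name chosen for this sketch)
collect composite define meansd = mean sd, trim

collect style header result, title(hide) level(hide)

collect layout (rep78) (foreign#result[meansd])

* optional: write the table to a file (table1.docx is a placeholder name)
collect export table1.docx, replace

With many categorical variables, the -table- call can be wrapped in a -foreach- loop with the -append- option, as in the code above.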
