Create a two-way table for categorical variables

Shima Abdulgader

Join Date: Jul 2024

Posts: 4
#1

Create a two-way table for categorical variables

10 Jul 2024, 04:32

I generated 4 categorical variables with 6 categories each (1-6). The 6 categories are similar across variables but the number of observations differ between them. I want to create a table that summarizes the counts of all 4 variables (rows) with their 6 categories (columns) and also shows zero if no observations.
I used the table command but it shows no observations when I select 2 or more variables to tabulate.
Tags: None
Andrew Musau

Join Date: Oct 2014

Posts: 10194
#2

10 Jul 2024, 06:06

Something like this? https://www.statalist.org/forums/for...across-columns
1 like
Comment
Shima Abdulgader

Join Date: Jul 2024

Posts: 4
#3

11 Jul 2024, 04:40

Thank you so much, Andrew! this is helpful. I tried the code using 2 variables and it worked. However, one of the "Timepoints" is missing and the order of the labels is not in order. See my code below.
One more question: When adding the remaining 2 variables, should I add the labels (lab3 & lab4) for the 6 categories?

label values Ultrapos_Cultpos lab1
label def lab1 1 "Baseline", modify
label def lab1 2 "M3", modify
label def lab1 3 "M6", modify
label def lab1 4 "M3-9", modify
label def lab1 5 "M12", modify
label def lab1 6 "M12+", modify
label values Ultrapos_Cultneg lab2
label def lab2 1 "Baseline", modify
label def lab2 2 "M3", modify
label def lab2 3 "M6", modify
label def lab2 4 "M3-9", modify
label def lab2 5 "M12", modify
label def lab2 6 "M12+", modify
frame put *, into(table)
frame table{
decode Ultrapos_Cultpos, gen(TimepointsUltrapos_Cultpos)
decode Ultrapos_Cultneg, gen(TimepointsUltrapos_Cultneg)
keep Timepoints*
gen long obsno=_n
reshape long Timepoints, i(obsno) string
lab define Timepoints 1 "Baseline" 2 "M3" 3 "M6" 4 "M3-9" 5 "M12" 6 "M12+"
lab define testingmethod 1 "Ultrapos_Cultpos" 2 "Ultrapos_Cultneg"
encode _j, gen(testingmethod) label(testingmethod)
*encode Timepoints, gen(Timepoints) label(Timepoints)
dtable i.testingmethod, by(Timepoints)
}
frame drop table

See the output table below

Timepoints
Baseline M12 M12+ M3-9 M6 Total

N 16 (28.1%) 10 (17.5%) 15 (26.3%) 7 (12.3%) 9 (15.8%) 57 (100.0%)
testingmethod
Ultrapos_Cultpos 0 (0.0%) 0 (0.0%) 10 (66.7%) 0 (0.0%) 0 (0.0%) 10 (17.5%)
Ultrapos_Cultneg 16 (100.0%) 10 (100.0%) 5 (33.3%) 7 (100.0%) 9 (100.0%) 47 (82.5%)
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10194
#4

11 Jul 2024, 05:29

I don't have an example to test, but I would unify the values to begin with (before reshape). For example, if the category "X" takes the values 1 in variable "A" and takes the values 3 in variable "B", then recode the values to 3 in "A" or to 1 in "B". Then after reshaping, you can define one unified set of labels. If this is not useful, present a data example which replicates your problem.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35698
#5

11 Jul 2024, 06:21

See also https://www.statalist.org/forums/for...updated-on-ssc

As the thread reveals, groups is available through the Stata Journal.
Comment
Shima Abdulgader

Join Date: Jul 2024

Posts: 4
#6

11 Jul 2024, 07:15

Thank you Nick Cox. I am using Stata 18, which says "command groups is unrecognized".
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35698

11 Jul 2024, 07:24

As already mentioned, groups is available from the Stata Journal. It is not an official command, so you must install it.

As its author, I was slightly mischievous in using as a name a common English word that might evoke several hits on any search. StataCorp have on occasion laid claim to all words in the English language as possible command names.

So, a direct if not quite predictable way to install it is through typing this search command:

Code:

. search st0496, entry

Search of official help files, FAQs, Examples, and Stata Journals

SJ-18-1 st0496_1  . . . . . . . . . . . . . . . . . Software update for groups
        (help groups if installed)  . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/18   SJ 18(1):291
        groups exited with an error message if weights were specified;
        this has been corrected

SJ-17-3 st0496  . . . . .  Speaking Stata: Tables as lists: The groups command
        (help groups if installed)  . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/17   SJ 17(3):760--773
        presents command for listing group frequencies and percents and
        cumulations thereof; for various subsetting and ordering by
        frequencies, percents, and so on; for reordering of columns;
        and for saving tabulated data to new datasets

Hence at the time of writing the latest public version can be downloaded by first running that command and then clicking on st0496_1 and so on.

Comment

Shima Abdulgader

Join Date: Jul 2024
Posts: 4

11 Jul 2024, 08:07

PID	Baseline_smear	Baseline_Xprt	Baseline_culture	M3_smear	M3_Xprt	M3_culture	M6_smear	M6_Xprt	M6_culture	M3-9_smear	M3-9_Xprt	M3-9_culture
1	0	1	1	0	0	0	0	0	0	0	0	0
2	0	1	0	0	0	0	0	0	0	1	0	0
3	0	0	0	1	0	0	0	0	0	0	0	0
4	1	1	1	0	0	0	0	0	0	0	0	0
5	0	0	0	0	1	1	0	0	0	0	0	0
6	1	0	0	0	0	0	0	0	0	0	0	0
7	1	1	0	0	0	0	0	0	0	0	0	0
8	0	1	1	0	0	0	0	0	0	0	0	0
9	0	1	0	0	0	0	0	0	0	0	0	0
10	0	0	0	1	1	0	0	0	0	0	1	0
11	0	0	0	0	0	0	0	1	1	0	0	0
12	0	0	0	0	1	1	0	0	0	0	0	0
13	0	1	0	0	0	0	0	0	0	0	0	0
14	0	1	0	0	0	0	0	0	0	0	0	0
15	0	1	0	0	0	0	0	0	0	0	0	0
16	0	1	0	0	0	0	0	0	0	0	0	0
17	1	1	1	0	0	0	0	0	0	0	0	0
18	0	0	0	0	0	0	0	1	0	1	0	0
19	0	0	0	0	0	0	0	0	0	0	1	1
20	1	1	1	0	0	0	0	0	0	0	0	0

@Andrew Musau Nick Cox I am not a statistician nor an expert in state, so I'd appreciate it if both of you, using the example above, show me how to get the table using your respective approaches. Below is an explanation of the example:
We are testing TB patients with 3 methods smear, Xpert, and culture at 6 time points (categories) Baseline, M3, M6, and M3-9 (just gave 4 in the example for simplicity). Want to get the frequency of different testing method combinations? Xpert positive_culture positive (v1), Xpert positive_culture negative (v2), and smear positive only (v3). The time points are similar for all patients.

Nick Cox, I managed to install the groups command, thanks.

Comment

Nick Cox

Join Date: Mar 2014

Posts: 35698
#9

11 Jul 2024, 08:45

The example data -- whether alluded to or exemplified -- seems to keep changing. Why not try the command yourself and see if it helps?
1 like
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10194

#10

11 Jul 2024, 16:55

As Nick notes, your data description is not consistent with #1. Nevertheless, maybe the following helps.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(pid baseline_smear baseline_xprt baseline_culture m3_smear m3_xprt m3_culture m6_smear m6_xprt m6_culture m39_smear m39_xprt m39_culture)
 1 0 1 1 0 0 0 0 0 0 0 0 0
 2 0 1 0 0 0 0 0 0 0 1 0 0
 3 0 0 0 1 0 0 0 0 0 0 0 0
 4 1 1 1 0 0 0 0 0 0 0 0 0
 5 0 0 0 0 1 1 0 0 0 0 0 0
 6 1 0 0 0 0 0 0 0 0 0 0 0
 7 1 1 0 0 0 0 0 0 0 0 0 0
 8 0 1 1 0 0 0 0 0 0 0 0 0
 9 0 1 0 0 0 0 0 0 0 0 0 0
10 0 0 0 1 1 0 0 0 0 0 1 0
11 0 0 0 0 0 0 0 1 1 0 0 0
12 0 0 0 0 1 1 0 0 0 0 0 0
13 0 1 0 0 0 0 0 0 0 0 0 0
14 0 1 0 0 0 0 0 0 0 0 0 0
15 0 1 0 0 0 0 0 0 0 0 0 0
16 0 1 0 0 0 0 0 0 0 0 0 0
17 1 1 1 0 0 0 0 0 0 0 0 0
18 0 0 0 0 0 0 0 1 0 1 0 0
19 0 0 0 0 0 0 0 0 0 0 1 1
20 1 1 1 0 0 0 0 0 0 0 0 0
end

reshape long baseline m3 m6 m39, i(pid) j(Category) string
rename (baseline m3 m6 m39) var=
reshape long var, i(pid Category) j(which) string
keep if var
replace Category= proper(substr(Category, 2, .))
encode which, g(Condition)
dtable i.Condition, by(Category)

You can work on assigning meaningful labels to your categories.

Res.:

Code:

 
. dtable i.Condition, by(Category)

-----------------------------------------------------
                            Category                 
            Culture    Smear      Xprt       Total   
-----------------------------------------------------
N          9 (24.3%) 9 (24.3%) 19 (51.4%) 37 (100.0%)
Condition                                            
  baseline 5 (55.6%) 5 (55.6%) 12 (63.2%)  22 (59.5%)
  m3       2 (22.2%) 2 (22.2%)  3 (15.8%)   7 (18.9%)
  m39      1 (11.1%) 2 (22.2%)  2 (10.5%)   5 (13.5%)
  m6       1 (11.1%)  0 (0.0%)  2 (10.5%)    3 (8.1%)
-----------------------------------------------------

Announcement