tab frequency by order

jykim

Join Date: Apr 2014

Posts: 20
#1

17 Apr 2014, 14:28

Hi, is there any way to show a frequency table in descending order of frequency?

For example, I want to see the table below ordered from the highest frequency to the lowest.

Thank you.

           |
           |         A          B |     Total
-----------+----------------------+----------
        -1 |        94         24 |       118
        64 |         1          0 |         1
       160 |       579         65 |       644
       329 |         1          0 |         1
       635 |       155        172 |       327
...
      6970 |        44          7 |        51
      7138 |         0          1 |         1
715
-----------+----------------------+----------
     Total |     3,057        713 |     3,770

Pearson chi2(61) = 782.0937 Pr = 0.000
Phil Schumm

Join Date: Mar 2014

Posts: 169
#2

17 Apr 2014, 14:44

Use the sort option to the tabulate command.
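
A minimal sketch of that suggestion for a one-way table, assuming the auto dataset shipped with Stata (it is not the data used in this thread):

Code:

sysuse auto, clear
* one-way table of rep78, rows listed in descending order of frequency
tabulate rep78, sort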
jykim

Join Date: Apr 2014

Posts: 20
#3

17 Apr 2014, 15:21

Thank you for your help.
However, the sort option only works with one-way tables.
Is there any way to use it with a two-way table?
Thank you.
Joe Canner

Join Date: Mar 2014

Posts: 580
#4

17 Apr 2014, 15:52

Leaving aside the question of why one would need to do such a thing, you could use the matcell() and matrow() options:

Code:

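* two-way table; save the cell counts in C and the row values in R
tab rowvar colvar, matcell(C) matrow(R)
* columns of A: row value, the two cell counts, row total
matrix A=R,C,C[1...,1]+C[1...,2]
* sort the rows of A on column 4 (row total), descending
mata: st_matrix("A", sort(st_matrix("A"),-4))
matrix list A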
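
As a concrete illustration of the same idea, here is a sketch applied to the auto dataset shipped with Stata (an assumption made for the example, not the original poster's data). It relies on the column variable, foreign, having exactly two categories. The column names given to A are chosen here only for readability.

Code:

sysuse auto, clear
* rep78 by foreign; foreign has exactly two categories
tabulate rep78 foreign, matcell(C) matrow(R)
* columns of A: rep78 value, the two cell counts, row total
matrix A = R, C, (C[1...,1] + C[1...,2])
* sort the rows of A on column 4 (row total), largest first
mata: st_matrix("A", sort(st_matrix("A"), -4))
* illustrative column names, assigned after the sort
matrix colnames A = rep78 Domestic Foreign Total
matrix list A

With more than two column categories the row total would need one term per column of C.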
Phil Schumm

Join Date: Mar 2014

Posts: 169
#5

17 Apr 2014, 16:03

My apologies (I should have read more carefully). I'm not aware of any "off the shelf" command, but Nick showed how to tackle this problem in his Speaking Stata column entitled "On numbers and strings". For example:

Code:

sysuse nlsw88
decode industry, gen(foo)
bysort foo : gen freq = -_N
egen foobar = axis(freq foo), label(foo)
tab foobar race

Note that axis() is part of Nick's egenmore package, which you can install with ssc install egenmore.

Last edited by Phil Schumm; 17 Apr 2014, 16:15.

Nick Cox

Join Date: Mar 2014
Posts: 35211
#6

17 Apr 2014, 17:45

Consider a multiway table with I x J x K ... categories formed by cross-combination of any number of variables. Then groups (SSC) collapses that table into one long table. Sorting by frequency is a key option.

Here is an example:

Code:

. webuse nlswork, clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. groups race union south collgrad

  +---------------------------------------------------+
  | race   union   south   collgrad   Freq.   Percent |
  |---------------------------------------------------|
  |    1       0       0          0    5499     28.59 |
  |    1       0       0          1    1344      6.99 |
  |    1       0       1          0    3115     16.19 |
  |    1       0       1          1     818      4.25 |
  |    1       1       0          0    1587      8.25 |
  |---------------------------------------------------|
  |    1       1       0          1     681      3.54 |
  |    1       1       1          0     415      2.16 |
  |    1       1       1          1     134      0.70 |
  |    2       0       0          0     963      5.01 |
  |    2       0       0          1     143      0.74 |
  |---------------------------------------------------|
  |    2       0       1          0    2367     12.31 |
  |    2       0       1          1     311      1.62 |
  |    2       1       0          0     726      3.77 |
  |    2       1       0          1     132      0.69 |
  |    2       1       1          0     658      3.42 |
  |---------------------------------------------------|
  |    2       1       1          1     131      0.68 |
  |    3       0       0          0     102      0.53 |
  |    3       0       0          1      39      0.20 |
  |    3       0       1          0      20      0.10 |
  |    3       0       1          1       6      0.03 |
  |---------------------------------------------------|
  |    3       1       0          0      25      0.13 |
  |    3       1       0          1      18      0.09 |
  |    3       1       1          1       1      0.01 |
  +---------------------------------------------------+

. groups race union south collgrad, order(high)

  +---------------------------------------------------+
  | race   union   south   collgrad   Freq.   Percent |
  |---------------------------------------------------|
  |    1       0       0          0    5499     28.59 |
  |    1       0       1          0    3115     16.19 |
  |    2       0       1          0    2367     12.31 |
  |    1       1       0          0    1587      8.25 |
  |    1       0       0          1    1344      6.99 |
  |---------------------------------------------------|
  |    2       0       0          0     963      5.01 |
  |    1       0       1          1     818      4.25 |
  |    2       1       0          0     726      3.77 |
  |    1       1       0          1     681      3.54 |
  |    2       1       1          0     658      3.42 |
  |---------------------------------------------------|
  |    1       1       1          0     415      2.16 |
  |    2       0       1          1     311      1.62 |
  |    2       0       0          1     143      0.74 |
  |    1       1       1          1     134      0.70 |
  |    2       1       0          1     132      0.69 |
  |---------------------------------------------------|
  |    2       1       1          1     131      0.68 |
  |    3       0       0          0     102      0.53 |
  |    3       0       0          1      39      0.20 |
  |    3       1       0          0      25      0.13 |
  |    3       0       1          0      20      0.10 |
  |---------------------------------------------------|
  |    3       1       0          1      18      0.09 |
  |    3       0       1          1       6      0.03 |
  |    3       1       1          1       1      0.01 |
  +---------------------------------------------------+

There was an early discussion of groups in

SJ-3-4 pr0011 . . . . . . . . Speaking Stata: Problems with tables, Part II
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q4/03 SJ 3(4):420--439 (no commands)
reviews three user-written commands (tabcount, makematrix,
and groups) as different approaches to tabulation problems

which is freely available at http://www.stata-journal.com/sjpdf.h...iclenum=pr0011

but that is not up-to-date. The 2012 version now at SSC is documented in moderate detail in its own help.

The chi-square test detail is naturally something you can get already from tabulate (and indeed in more detail from tabchi from tab_chi on SSC).


bd asare

Join Date: Mar 2015

Posts: 2
#7

10 Mar 2015, 17:15

I also learnt cross-tabulation elsewhere on this platform, before becoming a member, through this simple command pattern:
bysort a: tab2 b c
I was struggling to get a good cross-tabulation of three variables: study towns, visits rate and practices.

My example and output looked like this:
command
bysort towns: tab2 monitoring_visitsrate composite_hygienelevel, chi2 exact

output
towns = ejura

-> tabulation of monitoring_visitsrate by composite_hygienelevel

Enumerating sample-space combinations:
stage 3: enumerations = 1
stage 2: enumerations = 17
stage 1: enumerations = 0

monitoring_visits |     composite_hygienelevel
             rate | above ave    average  below ave |     Total
------------------+---------------------------------+----------
  Frequent visits |         3          6          4 |        13
Infrequent visits |         3         19         33 |        55
        No visits |         1          0          6 |         7
------------------+---------------------------------+----------
            Total |         7         25         43 |        75

Pearson chi2(4) = 9.3517 Pr = 0.053
Fisher's exact = 0.025

------------------------------------------------------------------------------------------------
-> towns = mankranso

-> tabulation of monitoring_visitsrate by composite_hygienelevel

Enumerating sample-space combinations:
stage 3: enumerations = 1
stage 2: enumerations = 4
stage 1: enumerations = 0

monitoring_visits |     composite_hygienelevel
             rate | above ave    average  below ave |     Total
------------------+---------------------------------+----------
  Frequent visits |         9         14         16 |        39
Infrequent visits |         1          4          5 |        10
        No visits |         1          0          0 |         1
------------------+---------------------------------+----------
            Total |        11         18         21 |        50

Pearson chi2(4) = 4.4263 Pr = 0.351
Fisher's exact = 0.466
bd asare

Join Date: Mar 2015

Posts: 2
#8

10 Mar 2015, 17:57

I also learnt cross-tabulation elsewhere on this platform, before becoming a member, through this simple command pattern:
bysort a: tab2 b c
I was struggling to get a good cross-tabulation of three variables: study towns, visits rate and practices.

My example and output looked like this:
command
bysort towns: tab2 monitoring_visitsrate composite_hygienelevel, chi2 exact

output
towns = A

-> tabulation of monitoring_visitsrate by composite_hygienelevel

Enumerating sample-space combinations:
stage 3: enumerations = 1
stage 2: enumerations = 17
stage 1: enumerations = 0

monitoring_visits |     composite_hygienelevel
             rate | above ave    average  below ave |     Total
------------------+---------------------------------+----------
  Frequent visits |         3          6          4 |        13
Infrequent visits |         3         19         33 |        55
        No visits |         1          0          6 |         7
------------------+---------------------------------+----------
            Total |         7         25         43 |        75

Pearson chi2(4) = 9.3517 Pr = 0.053
Fisher's exact = 0.025

------------------------------------------------------------------------------------------------
-> towns = B

-> tabulation of monitoring_visitsrate by composite_hygienelevel

Enumerating sample-space combinations:
stage 3: enumerations = 1
stage 2: enumerations = 4
stage 1: enumerations = 0

monitoring_visits |     composite_hygienelevel
             rate | above ave    average  below ave |     Total
------------------+---------------------------------+----------
  Frequent visits |         9         14         16 |        39
Infrequent visits |         1          4          5 |        10
        No visits |         1          0          0 |         1
------------------+---------------------------------+----------
            Total |        11         18         21 |        50

Pearson chi2(4) = 4.4263 Pr = 0.351
Fisher's exact = 0.466
Juan Price Elton

Join Date: Jan 2019

Posts: 47
#9

27 Sep 2021, 19:41

Hello: I am joining this discussion very late (the last comment is from 2015). I used the solution proposed by Nick in post #6 and it worked (thanks, Nick). The table I get with this method is good, but I need to ask Stata to (i) show only those observations that have a frequency higher than a certain number, and (ii) display the table in blocks of, say, 10 lines separated by breaks (I know how to do this with "list" but not with groups). The table is very long, so when Stata produces it I can only scroll up from the bottom and cannot see the entire table. I hope my explanation is not too confusing. Thank you. Juan

Nick Cox

Join Date: Mar 2014
Posts: 35211

#10

28 Sep 2021, 03:44

With groups (now from the Stata Journal) you can do this. I am not clear what your (ii) means but I have made a guess.

Code:

. webuse nlswork , clear
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)


. groups birth_yr , select(f >= 10) sep(10)

  +------------------------------------+
  | birth_yr   Freq.   Percent     %<= |
  |------------------------------------|
  |       41      26      0.09    0.09 |
  |       42     574      2.01    2.10 |
  |       43    1522      5.33    7.44 |
  |       44    2095      7.34   14.78 |
  |       45    2311      8.10   22.88 |
  |       46    2707      9.49   32.36 |
  |       47    3040     10.65   43.02 |
  |       48    3017     10.57   53.59 |
  |       49    3095     10.85   64.44 |
  |       50    2718      9.53   73.96 |
  |------------------------------------|
  |       51    2765      9.69   83.65 |
  |       52    2722      9.54   93.19 |
  |       53    1935      6.78   99.98 |
  +------------------------------------+
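
For comparison, a rough equivalent using only official commands can be sketched with contract, gsort and list. This is a guess at what is wanted rather than a feature of groups, and the variable name freq is chosen here purely for illustration. Note that contract replaces the data in memory.

Code:

webuse nlswork, clear
* one observation per value of birth_yr, with its frequency stored in freq
contract birth_yr, freq(freq)
* keep only the values with a frequency of at least 10
keep if freq >= 10
* highest frequency first
gsort -freq
* draw a separator line after every 10 rows
list birth_yr freq, sep(10) noobs

Wrapping these commands in preserve and restore would leave the original data untouched.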


Juan Price Elton

Join Date: Jan 2019

Posts: 47
#11

22 Feb 2022, 21:36

Nick: I really appreciate your help (your proposed solution worked super well and was exactly what I was looking for). By the way, I apologise for not replying earlier. Thank you so much. Juan
