-table-: report categories with frequency 0

Emil Alnor

Join Date: Jun 2021

Posts: 130
#1

-table-: report categories with frequency 0

18 Nov 2022, 03:08

I am trying to make a tables from a survey dataset. I have a lot of variables representing likert-scaled survey questions, where 1 "Strongly disagree", 2 "Partly disagree", 3 "Neither agree nor disagree", 4 "Partly agree", 5 "Strongly agree", 6 "Don't know", 7 "Not relevant". What I want is tables showing the frequency of respondents having answered those categories. I need all tables to include all categories. Problem is that for some variables (questions) nobody answered 1 "Strongly disagree" nor 6 "Don't know". I still need the table to show that 0 and 0% answered that, i.e., that the frequency for values 1 and 6 on variable s18 is 0. This is not the default in -table- and apperently adding the -, zerocounts- options doesn't help. Here is what I tried:

Code:

collect table () s18, zerocounts putexcel a1 = collect

And here is what my data looks like:

Code:

* Example generated by -dataex-. For more info, type help dataex clear input byte s18 4 5 4 5 5 5 4 5 7 5 4 . . 5 5 5 . 2 4 5 3 5 . 5 5 . . 5 5 5 5 . 5 2 5 5 5 5 5 4 5 . 4 5 5 2 . 5 5 5 5 . 5 5 . 5 . 5 5 7 5 5 5 5 5 5 4 5 3 5 5 5 5 4 end

What code do I need to write in order to produce a table, that shows that the frequency for values 1 and 6 on variable s18 is 0, and then export this to Excel?

Last edited by Emil Alnor; 18 Nov 2022, 03:11.
Tags: putexcel, tables

Leonardo Guizzetti

Join Date: Jul 2016
Posts: 2389

18 Nov 2022, 08:11

This is tricky, and I view it as a general limitation with the built-in tabulation commands (new and old table, tab, tabstat). Here are some approaches with the following toy dataset:

Code:

input byte(grp fq)
1 4
2 2
3 0
end

expand fq
gen byte wt = fq>0
drop fq
list

Result

Code:

. list

     +----------+
     | grp   wt |
     |----------|
  1. |   1    1 |
  2. |   2    1 |
  3. |   3    0 |
  4. |   1    1 |
  5. |   1    1 |
     |----------|
  6. |   1    1 |
  7. |   2    1 |
     +----------+

1) First create a variable that takes the value of 1 for any observation, or 0 for any missing category. In the example dataset, there are no observations from grp=3, but it is represented with a value of wt=0 to ensure that the level of grp exists. Then tabulate the sum of that variable.

Code:

. table (grp) (), stat(sum wt) stat(percent wt)

--------------------------
        |  Total   Percent
--------+-----------------
grp     |                 
  1     |      4     66.67
  2     |      2     33.33
  3     |      0      0.00
  Total |      6    100.00
--------------------------

2) If you only want to summarize one variable at a time, then the user-contributed command -fre- can help. Note, with -fre- you don't need an expanded dataset.

Code:

. fre grp [fw=wt], include(1/3)

grp
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   1     |          4      66.67      66.67      66.67
        2     |          2      33.33      33.33     100.00
        3     |          0       0.00       0.00     100.00
        Total |          6     100.00     100.00           
-----------------------------------------------------------

This will ensure that, at least, any level listed in -inlcude()- will appear in the table, including the possibility of zero counts. It will also return the results in r().

3) Another option is to build the table "manually" create the table contents, export it, and if necessary, adjust formatting.

Code:

. collapse (sum) n=wt , by(grp)
. sum n, meanonly
. gen pct = 100 * n / r(sum)
. list

     +---------------------+
     | grp   n         pct |
     |---------------------|
  1. |   1   4   66.666667 |
  2. |   2   2   33.333333 |
  3. |   3   0           0 |
     +---------------------+

Comment

George Ford

Join Date: Aug 2014
Posts: 3120

18 Nov 2022, 08:11

Code:

putexcel set putexcel2.xlsx, sheet(example1) replace
putexcel A1 = "Likert Value"
putexcel B1 = "Count"
local row = 2
forv i = 1/7 {
    putexcel A`row' = `i'
    summ s18 if s18==`i', meanonly
    putexcel B`row' = `r(N)'
    local ++row
}

Comment

Emil Alnor

Join Date: Jun 2021

Posts: 130
#4

21 Nov 2022, 02:55

Thanks both!

George Ford just some follow up questions, if you would be so kind:
1. Regarding the 4th line in the loop "putexcel B`row' = `r(N)'" I was wondering why you put "`'" around "r(N)". Not doing it returns "N: invalid cell name" so I suppose it is to make Stata read r(N) as the stored result of -summ- instead of as a row number?
2. Is "local ++row" the equivalent to adding "local row = row + 1" at the end of the loop?
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35432
#5

21 Nov 2022, 03:34

https://journals.sagepub.com/doi/epu...36867X20976340

https://journals.sagepub.com/doi/epu...6867X211063415

are a couple of tutorials on loops. ++ is discussed in the second.
1 like
Comment
George Ford

Join Date: Aug 2014

Posts: 3120
#6

21 Nov 2022, 07:49

In response:
1. Sometimes Stata lets you use r(N) and sometimes it wants `r(N)' (or any r(*) return). Look at p. 5 of this (https://www.stata.com/manuals/rstoredresults.pdf) for a description. For me, it's usually trial and error.
2. Yes. the ++ is the same as adding a value to the local in a loop. A handy shortcut. (Such handy coding is why I troll statalist).
1 like
Comment

Announcement