Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • -table-: report categories with frequency 0

    I am trying to make a tables from a survey dataset. I have a lot of variables representing likert-scaled survey questions, where 1 "Strongly disagree", 2 "Partly disagree", 3 "Neither agree nor disagree", 4 "Partly agree", 5 "Strongly agree", 6 "Don't know", 7 "Not relevant". What I want is tables showing the frequency of respondents having answered those categories. I need all tables to include all categories. Problem is that for some variables (questions) nobody answered 1 "Strongly disagree" nor 6 "Don't know". I still need the table to show that 0 and 0% answered that, i.e., that the frequency for values 1 and 6 on variable s18 is 0. This is not the default in -table- and apperently adding the -, zerocounts- options doesn't help. Here is what I tried:

    Code:
    collect table () s18, zerocounts
    putexcel a1 = collect
    And here is what my data looks like:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte s18
    4
    5
    4
    5
    5
    5
    4
    5
    7
    5
    4
    .
    .
    5
    5
    5
    .
    2
    4
    5
    3
    5
    .
    5
    5
    .
    .
    5
    5
    5
    5
    .
    5
    2
    5
    5
    5
    5
    5
    4
    5
    .
    4
    5
    5
    2
    .
    5
    5
    5
    5
    .
    5
    5
    .
    5
    .
    5
    5
    7
    5
    5
    5
    5
    5
    5
    4
    5
    3
    5
    5
    5
    5
    4
    end

    What code do I need to write in order to produce a table, that shows that the frequency for values 1 and 6 on variable s18 is 0, and then export this to Excel?
    Last edited by Emil Alnor; 18 Nov 2022, 03:11.

  • #2
    This is tricky, and I view it as a general limitation with the built-in tabulation commands (new and old table, tab, tabstat). Here are some approaches with the following toy dataset:

    Code:
    input byte(grp fq)
    1 4
    2 2
    3 0
    end
    
    expand fq
    gen byte wt = fq>0
    drop fq
    list
    Result

    Code:
    . list
    
         +----------+
         | grp   wt |
         |----------|
      1. |   1    1 |
      2. |   2    1 |
      3. |   3    0 |
      4. |   1    1 |
      5. |   1    1 |
         |----------|
      6. |   1    1 |
      7. |   2    1 |
         +----------+
    1) First create a variable that takes the value of 1 for any observation, or 0 for any missing category. In the example dataset, there are no observations from grp=3, but it is represented with a value of wt=0 to ensure that the level of grp exists. Then tabulate the sum of that variable.

    Code:
    . table (grp) (), stat(sum wt) stat(percent wt)
    
    --------------------------
            |  Total   Percent
    --------+-----------------
    grp     |                 
      1     |      4     66.67
      2     |      2     33.33
      3     |      0      0.00
      Total |      6    100.00
    --------------------------

    2) If you only want to summarize one variable at a time, then the user-contributed command -fre- can help. Note, with -fre- you don't need an expanded dataset.

    Code:
    . fre grp [fw=wt], include(1/3)
    
    grp
    -----------------------------------------------------------
                  |      Freq.    Percent      Valid       Cum.
    --------------+--------------------------------------------
    Valid   1     |          4      66.67      66.67      66.67
            2     |          2      33.33      33.33     100.00
            3     |          0       0.00       0.00     100.00
            Total |          6     100.00     100.00           
    -----------------------------------------------------------
    This will ensure that, at least, any level listed in -inlcude()- will appear in the table, including the possibility of zero counts. It will also return the results in r().


    3) Another option is to build the table "manually" create the table contents, export it, and if necessary, adjust formatting.

    Code:
    . collapse (sum) n=wt , by(grp)
    . sum n, meanonly
    . gen pct = 100 * n / r(sum)
    . list
    
         +---------------------+
         | grp   n         pct |
         |---------------------|
      1. |   1   4   66.666667 |
      2. |   2   2   33.333333 |
      3. |   3   0           0 |
         +---------------------+

    Comment


    • #3
      Code:
      putexcel set putexcel2.xlsx, sheet(example1) replace
      putexcel A1 = "Likert Value"
      putexcel B1 = "Count"
      local row = 2
      forv i = 1/7 {
          putexcel A`row' = `i'
          summ s18 if s18==`i', meanonly
          putexcel B`row' = `r(N)'
          local ++row
      }

      Comment


      • #4
        Thanks both!

        George Ford just some follow up questions, if you would be so kind:
        1. Regarding the 4th line in the loop "putexcel B`row' = `r(N)'" I was wondering why you put "`'" around "r(N)". Not doing it returns "N: invalid cell name" so I suppose it is to make Stata read r(N) as the stored result of -summ- instead of as a row number?
        2. Is "local ++row" the equivalent to adding "local row = row + 1" at the end of the loop?

        Comment


        • #5
          https://journals.sagepub.com/doi/epu...36867X20976340

          https://journals.sagepub.com/doi/epu...6867X211063415

          are a couple of tutorials on loops. ++ is discussed in the second.

          Comment


          • #6
            In response:
            1. Sometimes Stata lets you use r(N) and sometimes it wants `r(N)' (or any r(*) return). Look at p. 5 of this (https://www.stata.com/manuals/rstoredresults.pdf) for a description. For me, it's usually trial and error.
            2. Yes. the ++ is the same as adding a value to the local in a loop. A handy shortcut. (Such handy coding is why I troll statalist).

            Comment

            Working...
            X