Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Tabulate a variable to include a value for which there are no observations

    Hi,

    I am trying to tabulate a variable & then use matcell/matrow to build an output table with putexcel. I am having a problem as the variable I am trying to tabulate could have taken a specific value, but this never appears in the data, however, I want this to value/label to show in the output table (with frequency of 0).

    I use "auto.dta" & replace rep78=3 to equal 6 to explain. In this example, say rep78 could have taken any value from 1-6, but only 1-2 and 4-6 appear in the dataset. Is there a way to tabulate this to include the missing category (rep78=3).

    Code:
    // set data
    webuse auto
    replace rep78=6 if rep78==3
    // tabulate
    tabulate rep78, matcell(freq) matrow(names)
    As the category doesn't actually exist, I am unsure as to whether there is a way to make Stata think it does.

    Thank you!

  • #2

    tabulate has no hook of the kind you need, but various community-contributed commands are available that support what you want to some degree.

    Words are stretched here, but in Stata missing means in the dataset but having a missing value, so I commend the term absent for whatever might be in the data but isn't.


    . ssc desc fre

    -----------------------------------------------------------------------------------------------------------------------------------
    package fre from http://fmwww.bc.edu/repec/bocode/f
    -----------------------------------------------------------------------------------------------------------------------------------

    TITLE
    'FRE': module to display one-way frequency table

    DESCRIPTION/AUTHOR(S)

    fre displays, for each specified variable, a univariate
    frequency table containing counts, percent, and cumulative
    percent. Variables may be string or numeric. Labels, in full
    length, and values are printed. By default, fre only tabulates
    the smallest and largest 10 values (along with all missing
    values), but this can be changed. Furthermore, values with zero
    observed frequency may be included in the tables. The default
    for fre is to display the frequency tables in the results
    window. Alternatively, the tables may be written to a file on
    disk, either tab-delimited or LaTeX-formatted.

    KW: data management
    KW: frequencies
    KW: frequency table
    KW: tabulation

    Requires: Stata version 9.2

    Distribution-Date: 20150603

    Author: Ben Jann, University of Bern
    Support: email [email protected]


    INSTALLATION FILES (type net install fre)
    fre.ado
    fre.hlp

    ANCILLARY FILES (type net get fre)
    fre.zip
    -----------------------------------------------------------------------------------------------------------------------------------
    (type ssc install fre to install)

    . ssc desc tabcount

    -----------------------------------------------------------------------------------------------------------------------------------
    package tabcount from http://fmwww.bc.edu/repec/bocode/t
    -----------------------------------------------------------------------------------------------------------------------------------

    TITLE
    'TABCOUNT': module to tabulate frequencies, with zeros explicit

    DESCRIPTION/AUTHOR(S)

    tabcount tabulates frequencies for up to 7 variables. Its main
    distinctive features are that zero frequencies of one or more
    specified values or conditions are always shown in the table
    (i.e. entirely empty rows, columns, etc. are not omitted) and
    that reduced datasets and/or matrices containing the frequencies
    may also be saved. This version of tabcount is 2.0.0 and
    requires Stata 8. A previous version, much restricted, of
    tabcount for Stata 7 is included in this package as tabcount7.

    Distribution-Date: 20040924

    Author: Nicholas J. Cox, University of Durham
    Support: email [email protected]


    INSTALLATION FILES (type net install tabcount)
    tabcount.ado
    tabcount.hlp
    tabcount7.ado
    tabcount7.hlp
    -----------------------------------------------------------------------------------------------------------------------------------
    (type ssc install tabcount to install)

    Comment


    • #3
      Both of these are brilliant - thank you for replying so quickly

      Comment


      • #4
        Tabcount is a great command if one wants to show those variable categories and the value 0. However as far as I researched on these forums there is not a command to integrate tabcount with, for example tab2way or tabulate, in order to be able to show counts, column, row frequencies (and even calculate Chi-square test of association) displaying all categories of variables, including those with zero values. That would be so useful, especially for researchers working in healthcare since there are small samples. Disregarding one category of a variable can have negative impacts on a research.

        Is there any way to put some efforts in developing a code to include this, please? Had I known how to write coding at an advanced level I would have done it myself and publish it here, since many people deal with this issue.

        Comment


        • #5
          I can speak for tabcount -- which I wrote to scratch an itch -- but I haven't touched the code since 2004. The caprice here is that, sorry, but it doesn't interest me any more -- it's an awkward kludge and I don't recall using it myslf in an age -- and so I am unlikely to want to return to it. That said, the code is there for anyone to adapt given the usual protocols. It is not a wrapper for tabulate and -- if I recall correctly -- the whole point is that tabulate just is disinclined to show anything that doesn't exist in the data.

          There is otherwise more than you say. For example, chitesti and tabchii from tab_chi (SSC) work with zero observed frequencies.

          Code:
          . chitesti 9 0 0 \ 3  3 3
          
          observed frequencies from keyboard; expected frequencies from keyboard
          
                   Pearson chi2(2) =  18.0000   Pr =  0.000
          likelihood-ratio chi2(2) =  19.7750   Pr =  0.000
          
            +---------------------------------------------------+
            | observed   expected   notes   obs - exp   Pearson |
            |---------------------------------------------------|
            |        9      3.000   *           6.000     3.464 |
            |        0      3.000   *          -3.000    -1.732 |
            |        0      3.000   *          -3.000    -1.732 |
            +---------------------------------------------------+
          .

          If you want your request to register with StataCorp you'd be best advised to email them directly.

          Comment


          • #6
            Thank you, Nick!

            Since you wrote it, I see in the description that it also shows missing values, but in my case it doesn't.

            tabcount q7x3r3c5, v(0/10) zero miss

            Did I misunderstand that option "miss" shows missing values?

            Thank you!

            Comment


            • #7
              The code has to keep fighting against Stata's inclination to ignore missings. But it's a little subtle. I can't reconstruct why I wrote it one way rather than another, but this series of examples shows some variations.


              Code:
              sysuse auto, clear
              (1978 Automobile Data)
              
              . tabcount rep78, v1(1/5 .)
              
              ----------------------
              Repair    |
              Record    |
              1978      |      Freq.
              ----------+-----------
                      1 |          2
                      2 |          8
                      3 |         30
                      4 |         18
                      5 |         11
                      . |           
              ----------------------
              
              
              
              
              . tabcount rep78, v1(1/5) missing
              
              ----------------------
              Repair    |
              Record    |
              1978      |      Freq.
              ----------+-----------
                      1 |          2
                      2 |          8
                      3 |         30
                      4 |         18
                      5 |         11
              ----------------------
              
              . tabcount rep78, v1(1/5 .) missing
              
              ----------------------
              Repair    |
              Record    |
              1978      |      Freq.
              ----------+-----------
                      1 |          2
                      2 |          8
                      3 |         30
                      4 |         18
                      5 |         11
                      . |          5
              ----------------------

              Comment


              • #8
                That is very helpful. Thank you!

                Comment


                • #9
                  Do you have a recommendation for a similar workaround for -table- as well?

                  Comment


                  • #10
                    #9 The new table is massive and versatile. Writing a work-around to do anything it can't do would be impressive. But is it the case that table can't do what you want? I guess we need at least one specific example. I haven't begin to explore its scope fully yet.

                    Comment


                    • #11
                      Good morning - I am looking at this example from #1:

                      Code:
                      // set data
                      webuse auto
                      replace rep78=6 if rep78==3
                      // table command
                      table rep78

                      I would like to see value 3 with 0 observations. In the -db tables- menu, you will notice it is not possible to select level 3.
                      Click image for larger version

Name:	tables_novalues3.png
Views:	1
Size:	8.0 KB
ID:	1705092

                      Last edited by Anna Powell; 09 Mar 2023, 11:12.

                      Comment


                      • #12
                        That's a nice precise question and I hope someone has an answer.

                        Comment

                        Working...
                        X