  • n and % for a series of binary variables

    Hi. I would like to produce n and % for a series of binary variables in the same table.

    For example, instead of:
    Code:
    . sysuse nlsw88, clear
    (NLSW, 1988 extract)
    
    . tab1 married collgrad union
    
    -> tabulation of married  
    
        married |      Freq.     Percent        Cum.
    ------------+-----------------------------------
         single |        804       35.80       35.80
        married |      1,442       64.20      100.00
    ------------+-----------------------------------
          Total |      2,246      100.00
    
    -> tabulation of collgrad  
    
    college graduate |      Freq.     Percent        Cum.
    -----------------+-----------------------------------
    not college grad |      1,714       76.31       76.31
        college grad |        532       23.69      100.00
    -----------------+-----------------------------------
               Total |      2,246      100.00
    
    -> tabulation of union  
    
          union |
         worker |      Freq.     Percent        Cum.
    ------------+-----------------------------------
       nonunion |      1,417       75.45       75.45
          union |        461       24.55      100.00
    ------------+-----------------------------------
          Total |      1,878      100.00
    I would like to produce something like:

    Code:
    Variable     |      Freq.     Percent
    -------------+------------------------
         married |      1,442       64.20
    college grad |        532       23.69
           union |        461       24.55
    -------------+------------------------       
    or
    
    Variable         |      Freq.     Percent
    -----------------+------------------------
             married |      1,442       64.20
    college graduate |        532       23.69
        union worker |        461       24.55
    -----------------+------------------------
    i.e. the count and percentage of observations with the variable equal to 1, for each variable, labelled by either the value label or the variable label (either would work in my case).

    With best wishes and thanks,

    Jane

  • #2
    If each of the variables of interest is coded 0/1, just use -summarize-; the mean will be the proportion of 1s (multiply by 100 for the percent).
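
    A minimal sketch of this idea, assuming the variables are coded 0/1 with no other values: after -summarize-, r(sum) is the count of 1s and r(mean) is the proportion of 1s among non-missing observations. The column positions and display formats are arbitrary choices for illustration.

    ```stata
    sysuse nlsw88, clear
    foreach v of varlist married collgrad union {
        quietly summarize `v'
        * r(sum) counts the 1s in a 0/1 variable; 100*r(mean) is the percent
        display as text "`v'" _column(15) as result %9.0fc r(sum) _column(28) %6.2f 100*r(mean)
    }
    ```

    This labels each row with the variable name only; the value or variable label would need to be fetched separately.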



    • #3
      Thanks, Rich. Unfortunately, that uses the variable names for the row labels and I wanted the variable label (or value label for variable=1). Also, I am producing the tables for someone else, and I think that they would appreciate the counts and percentages, rather than just the percentages/100.



      • #4
        I'm out of time, but here's a start at getting the table you want.
        Code:
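        sysuse nlsw88, clear
        * prefix the three variables with v_ so reshape can treat them as one stub
        rename (married collgrad union) (v_=)
        keep v_*
        * store each variable's value label for 1 in a companion string variable
        foreach var of varlist v_* {
            local lbl : label (`var') 1
            generate str20 l`var' = `"`lbl'"'
            }
        generate id = _n
        * one row per observation and variable
        reshape long v_ lv_, i(id) j(var) string
        rename v_ value
        rename lv_ label
        generate missing = value==.
        generate count   = value==1
        generate pct     = 100*(value==1) if !missing(value)
        * sum the counts and average pct (missing values are skipped) for each label
        collapse (sum) missing count (mean) pct, by(label)
        gsort -count
        format %9.2f pct
        list, noobs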
        Code:
        . list, noobs
        
          +----------------------------------------+
          |        label   missing   count     pct |
          |----------------------------------------|
          |      married         0    1442   64.20 |
          | college grad         0     532   23.69 |
          |        union       368     461   24.55 |
          +----------------------------------------+
        Last edited by William Lisowski; 25 Nov 2019, 07:42.



        • #5
          Rich Goldstein is right. Here is some further code. I wouldn't be surprised at an existing command, or at its non-existence either! Naturally there is much scope for tuning details of presentation, etc.

          Code:
          sysuse nlsw88 
          
          preserve 
          
          local j = 0 
          foreach v in married collgrad union { 
          
              local ++j 
              local this`j' `"`: label (`v') 1'"' 
              if `"`this`j''"' == "" local this`j' "`v'" 
              local tostack `tostack' `v'
              
          } 
          
          stack `tostack', into(data) clear 
          collapse (sum) Freq = data (mean) mean = data, by(_stack) 
          gen Percent = string(100 * mean, "%3.2f") + "%" 
          
          forval J = 1/`j' { 
              label def _stack `J' `"`this`J''"', modify 
          } 
              
          label val _stack _stack 
          label var _stack "attribute" 
          label var Freq "Freq." 
          
          tabdisp _stack, c(Freq Percent) 
          
          restore

          Code:
          -------------------------------------
             attribute |      Freq.     Percent
          -------------+-----------------------
               married |       1442      64.20%
          college grad |        532      23.69%
                 union |        461      24.55%
          -------------------------------------



          • #6
            Hereabouts it is proverbial that beggars can't be choosers!



            • #7
              Thanks, Nick and William!
              Edited to add: this beggar is really very grateful.



              • #8
                Thanks; the hint was for your colleague if needed!



                • #9
                  Another solution would be to use -file-:

                  Code:
                  .  sysuse nlsw88, clear
                  (NLSW, 1988 extract)
                  
                  .  file open myfile using test.txt, write replace 
                  
                  .  file write myfile "Attribute" _column(20) "Freq." _column(30) "Percent" _n
                  
                  .  foreach v of varlist married union collgra {
                    2.         qui sum `v'
                    3.         local per = r(mean)*100
                    4.         file write myfile  "`v'" _column(15) %10.0gc (r(sum)) ///
                  >                 _column(30) %4.1f (`per') _n
                    5. }
                  
                  . 
                  . file close myfile
                  
                  . type test.txt
                  Attribute          Freq.     Percent
                  married            1,442     64.2
                  union                461     24.5
                  collgrad             532     23.7
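
                  The rows above are labelled with variable names. As a hedged variant, the sketch below labels each row with the variable label instead, falling back to the variable name when no label is defined. It uses -display- rather than -file write- for brevity, and the column positions are arbitrary; the local macro lbl is an illustrative name.

                  ```stata
                  sysuse nlsw88, clear
                  foreach v of varlist married collgrad union {
                      * fetch the variable label; use the name if the label is empty
                      local lbl : variable label `v'
                      if `"`lbl'"' == "" local lbl "`v'"
                      quietly summarize `v'
                      display as text `"`lbl'"' _column(20) as result %9.0fc r(sum) _column(32) %6.2f 100*r(mean)
                  }
                  ```

                  The same local could presumably be written out with -file write- in place of the variable name if a text file is wanted.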



                  • #10
                    Thanks, Scott. I am spoilt for choice!



                    • #11
                      Hi. This is my first time posting here, so I appreciate your kindness.
                      I have a dataset with observations at the individual level, and I want to create a new variable (% of a therapy) at the center level.
                      The therapy variable is coded 0/1, and the variable that identifies the center I want to aggregate by is a string variable.
                      Please help. I have tried bysort, egen, and count, with no satisfactory results.
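
                      One possible approach, sketched on toy data: -egen- with the mean() function and the by() option attaches the center-level percentage to every individual observation. The variable names center, therapy and pct_therapy and the four toy observations are assumptions for illustration, not taken from the poster's dataset.

                      ```stata
                      clear
                      * toy data: hypothetical string center identifier and 0/1 therapy indicator
                      input str1 center byte therapy
                      "A" 1
                      "A" 0
                      "B" 1
                      "B" 1
                      end
                      * percent receiving therapy within each center, repeated on every row
                      egen pct_therapy = mean(100 * therapy), by(center)
                      list, sepby(center)
                      ```

                      Missing values of therapy are ignored by mean(), so the percentage is taken over the non-missing observations in each center.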



                      • #12
                        Carlos Diaz

                        Welcome to Statalist.

                        You have posted your question as a reply to a topic it is not related to. If it hasn't been answered by the time this post is complete, please repost your question by going to the index page for the General Forum at

                        https://www.statalist.org/forums/for...ussion/general

                        and clicking on the "+ New Topic" button at the top.

                        Before doing so, you might take a few moments to review the Statalist FAQ linked to from the top of the page. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to use the dataex command to provide sample data, as described in section 12 of the FAQ.

                        The more you help others understand your problem, the more likely others are to be able to help you solve your problem.
