  • Summary statistics for binary variable: how to display frequencies

    Hello,

    This might be an elementary question, but I am struggling to find a way to create a summary statistics table that includes both continuous and binary variables. For binary variables, I realize that the summarize command doesn't give much meaningful data. Is there a way for me to include proportions for one specific variable in the summary table while including means and SDs for the rest? Or is there a way for me to include proportions for all variables with the summarize command? My code is as follows:

    bys mam: eststo: quietly estpost summarize age_mig education speaks_english
    esttab * using sum.tex, replace

    Both mam and speaks_english are binary variables here. Instead of getting the mean of the binary variable indicating English-speaking ability, I wanted to see the proportion of those who speak English in each category of mam.

    Any help would be greatly appreciated.


  • #2
    There is no conflict here. Suppose that in a small subset of size 10 on a binary variable, you have values 0 0 0 1 1 1 1 1 1 1, so that the proportion of the positive category (sometimes conventionally called "success" regardless of whether belonging to it is desirable) is 0.7 and the proportion of the zero category ("failure") is 0.3. The mean is also inevitably and helpfully 0.7. Once you know the mean you know one proportion, and (because it is a binary variable) once you know one proportion you know the other. It's the same information either way.
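
    The ten values above can be typed in directly to check this. This is only a minimal sketch; the variable name y is made up for illustration.

    ```stata
    clear
    input byte y
    0
    0
    0
    1
    1
    1
    1
    1
    1
    1
    end

    * The mean of a 0/1 variable is the proportion of ones
    summarize y
    display "mean = " r(mean)

    * The same information shown as frequencies and percentages
    tabulate y
    ```

    The mean reported by summarize and the share of ones reported by tabulate should agree, apart from the percentage scaling.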

    If you want to mix styles, however, you will need to go beyond summarize or edit your output accordingly.
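
    One way to mix styles with official commands only might look like the sketch below. It assumes the shipped nlsw88 dataset as a stand-in for the original data, with collgrad playing the role of mam and married playing the role of speaks_english; it does not reproduce the esttab output from the question.

    ```stata
    sysuse nlsw88, clear

    * Means and SDs of the continuous variables within each group
    tabstat age wage, by(collgrad) statistics(mean sd) nototal

    * Column percentages of the binary variable within each group
    tabulate married collgrad, column nofreq

    * Alternative: rescale the binary variable so that its mean reads as a percentage
    generate married_pct = 100 * married
    tabstat married_pct, by(collgrad) statistics(mean) nototal
    ```

    The rescaled variable is the simplest route if the table must come from a single summarize-style command, since its mean is the percentage directly; its SD would then be on the percentage scale too and is usually not worth reporting.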


    • #3
      Maybe you can try the command baselinetable.

      Code:
      net install st0524_2, from("http://www.stata-journal.com/software/sj20-3") replace
      
      sysuse nlsw88.dta, clear
      baselinetable age(cts) race married wage(cts) ttl_exp(cts), by(collgrad, totalcolumn)


      • #4
        Thank you, Nick Cox. That is a helpful insight; I had not thought of that before. I guess I just wanted to display it in terms of percentages for easier reading, so I ended up using frequency tables.

        Also, thank you, Will ZHANG! I see that baselinetable gives frequencies as well, but I think there is a limit on the number of observations it can take. Appreciate the feedback!
