  • catplot revised on SSC: Plots of frequencies, fractions or percents of categorical data

    Thanks as ever to Kit Baum, a new version of catplot is available on SSC. If you're interested in plotting categorical data and have never heard of catplot, you should

    GOTO to COMEFROM

    and skip or at most skim this preamble. If you know about catplot and want to know more about why it has been revised, the full story is longer than I want to tell, but here's a synopsis.

    Stata has supported bar charts from early days but with a bias towards plotting variables you have already, or summaries of them like means. Plotting bar (or dot) charts for categorical data is a two-step problem, trivial though the two steps may seem: calculate frequencies, proportions, or percents as new variables; and then draw your charts. But such charts were not so well supported.

    In the 1990s various user-written (community-contributed as we now say) commands were posted as extensions to fill the gap. In 2003, Stata's official graphics was completely revamped in Stata 8, yet charts for categorical data were still not well supported. But as in some of those 1990s extras, it was evident that something like

    Code:
    gen freq = 1
    and then say

    Code:
    graph bar (sum) freq, over(foo) over(bar)
    would get you there in two steps in terms of a chart of frequencies. Proportions and percents are harder, although the code looks all too obvious once you have thought it out. There is no fun in doing that the second time, let alone repeatedly, so catplot was born as a wrapper and posted on SSC early in 2003. The syntax was a bit awkward, so I acted quickly and posted a revised version in 2010. The syntax was still a bit awkward so I am again acting quickly to post a revised version 14 years later, prompted partly by a recent kind comment from a Stata friend that the command remains useful.
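    For anyone curious about the percent case, here is a minimal sketch of one way to do the two steps by hand. It is not catplot's internal code. It assumes the auto data, leans on official contract for the first step, and the variable names freq and pct are purely illustrative.

    ```stata
    sysuse auto, clear

    * Step 1: reduce to one observation per category of rep78,
    * holding its frequency and its percent of the total
    * (nomiss drops observations with missing rep78 before counting)
    contract rep78, freq(freq) percent(pct) nomiss

    * Step 2: plot the percents as they stand
    graph hbar (asis) pct, over(rep78) ytitle(Percent)
    ```

    Note that contract replaces the data in memory, so wrap this in preserve and restore if you need the original data afterwards.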

    Meanwhile, StataCorp also acted quickly and added graph bar (count) and graph bar (percent) and their equivalents for hbar and dot in 2014 in the life of Stata 13. I have not completely mastered either as new syntax, given that I already knew how to do it with other commands. And as catplot seems to remain popular, whether the reasons for that are good or not so good, I thought that a re-issue was overdue.
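    For comparison, a minimal sketch of the official syntax just mentioned, assuming the auto data and a Stata recent enough to include graph bar (count) and graph bar (percent). The graph names O1 and O2 are just illustrative.

    ```stata
    sysuse auto, clear

    * Frequencies of each category of rep78
    graph hbar (count), over(rep78) name(O1, replace)

    * Percents of each category of rep78
    graph hbar (percent), over(rep78) name(O2, replace)
    ```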

    COMEFROM from GOTO

    What's new in this release?

    catplot remains a wrapper for official graph commands, but the syntax is closer to that of official Stata.

    The help file is much extended with more examples.

    I've added some discouragement against using asyvars or asyvars stack, which in my prejudiced view are often steps in a poor direction.

    Here's a taster in terms of examples of its use. I use stcolor in Stata 18. If you're working with a previous version or prefer a different scheme, some tweaks may seem desirable, as well as a different scheme specification.

    Code:
    . set scheme stcolor 
    
    * Read in data
    
    . sysuse auto, clear
    
    *  Horizontal bar chart showing category frequencies
    
    . catplot, over(rep78) name(G1, replace)
    [Graph: catplot_G1.png]

    Code:
    * Horizontal bar chart showing category percents

    . catplot, over(rep78) percent name(G2, replace)
    (graph not shown here, but similar to previous in shape)

    Code:
    * Given foreign or domestic, what is percent breakdown of repair record?

    . catplot, over(rep78) over(foreign) percent(foreign) name(G3, replace)
    [Graph: catplot_G3.png]


    Code:
    * Given repair record, what is percent breakdown of foreign or domestic?

    . catplot, over(foreign) over(rep78) percent(rep78) name(G4, replace)
    [Graph: catplot_G4.png]
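
    The difference between percent(foreign) in G3 and percent(rep78) in G4 can be checked against a two-way table. This is only a sketch using official tabulate on the auto data: column percents should correspond to the G3 breakdown (percents within each category of foreign) and row percents to the G4 breakdown (percents within each category of rep78).

    ```stata
    sysuse auto, clear

    * Percents of repair record within each category of foreign (compare G3)
    tabulate rep78 foreign, column nofreq

    * Percents of foreign or domestic within each repair record (compare G4)
    tabulate rep78 foreign, row nofreq
    ```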

    Code:
    * And show the percents as numeric text? (You may need to add some space)

    . catplot, over(foreign) over(rep78) percent(rep78) blabel(bar, format(%02.0f)) ysc(r(0 105)) name(G5, replace)

    [Graph: catplot_G5.png]

    Code:
     
    
    * Show that as a side-by-side display - and add an axis title too

    . catplot, by(foreign, l1title(Repair record 1978)) over(rep78) percent(rep78) blabel(bar, format(%02.0f)) ysc(r(0 105)) name(G6, replace)
    [Graph: catplot_G6.png]

  • #2
    Some puzzling reports at https://www.statalist.org/forums/for...to-have-broken were diagnosed as follows:

    1. Some people were trying the old syntax with the new command code -- which should have failed absolutely.

    2. Yet catplot didn't throw them out, but persisted and gave some bizarre results.

    3. That was my bug, as the incorrect syntax was being ignored -- incorrectly.

    Thanks as always to Kit Baum, a fixed .ado file is now visible on SSC.

    If you downloaded the new catplot in response to #1 and were not bitten, fine and good, but please update your package at your convenience.



    • #3
      Can the revised -catplot- provide an option to allow bars of total? Just like the second plot below.

      Code:
      sysuse auto
      catplot_new, over(rep78) over(foreign) percent(foreign) blabel(bar, format(%9.2f))
      [Graph: Graph1.png]

      [Graph: Graph2.png]




      • #4
        I don't know how you got that graph -- or indeed what is catplot_new except something of your own?

        This works, but catplot doesn't support by(, total) with percents as you might hope. That limitation should be documented at some point.

        Code:
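        sysuse auto, clear

        set scheme stcolor

        * Work on a temporary copy: restore below brings back the original data
        preserve

        * Duplicate every observation; toexpand is 1 for the copies, 0 otherwise
        expand 2, gen(toexpand)

        * Put the copies in a new category 2, which serves as the total
        replace foreign = 2 if toexpand

        * origin is the value label attached to foreign in the auto data
        label def origin 2 Total, add

        catplot, over(rep78) over(foreign) percent(foreign) blabel(bar, format(%3.2f))

        restore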



        • #5
          Dear Nick Cox, thank you very much. Sorry for the confusion about -catplot_new-: that is my own renaming, to distinguish the 2024 version of catplot from the 2010 version. By the way, I'm more familiar with the grammar of the old version of catplot.
