  • Grouped bar graphs

    Hi,

    I’d like to draw graphs with grouped bars: one showing the number of cases (absolute frequencies) and one showing the column percentages (relative frequencies).
    In general Stata bar graphs only compare means, not counts or (column) percentages.

    [Image: Bar graph.png]

    Does anyone know how to do it?

    Thanks in advance
    Sabine

  • #2
    Please post an excerpt from your data to give us something to experiment with. You can do this with the dataex package, described in the FAQ.


    • #3
      "In general Stata bar graphs only compare means, not counts or (column) percentages"
      I don't think so. Please take a look at this excerpt from the Stata Manual.

      All in all, the best way to get an insightful reply is to follow Friedrich's advice in #2.

      That said, I believe you can produce the two graphs separately, give each one a name with the name() option, and then put them together with graph combine to reach your goal.
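
      A minimal sketch of that name-then-combine approach follows. It is only an illustration: it uses the auto dataset shipped with Stata as a stand-in for the real data, the graph names counts and percents are arbitrary, and it assumes a Stata version in which graph bar accepts (count) and (percent) without a variable list.

      ```stata
      * Stand-in data; replace foreign and rep78 with your own categorical variables
      sysuse auto, clear

      * One graph with absolute frequencies, one with percentages, each kept in memory under a name
      graph bar (count), over(foreign) over(rep78) title("Counts") name(counts, replace)
      graph bar (percent), over(foreign) over(rep78) title("Percent") name(percents, replace)

      * Put the two named graphs side by side
      graph combine counts percents
      ```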

      Hopefully that helps.
      Best regards,

      Marcos


      • #4
        Please excuse my late reply; I didn't have access to Stata during the weekend.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input long id byte(marstat east)
          1 3 1
          2 1 0
          3 2 0
          4 1 0
          5 1 1
          6 1 0
          7 1 0
          8 2 0
          9 1 0
         10 1 0
         11 1 0
         12 1 1
         13 2 0
         14 1 1
         15 2 0
         16 1 1
         17 1 1
         18 1 0
         19 1 1
         20 2 0
         21 2 0
         22 1 1
         23 1 1
         24 2 0
         25 1 1
         26 1 1
         27 3 1
         28 1 0
         29 2 1
         30 1 1
         31 1 0
         32 2 0
         33 1 0
         34 1 0
         35 1 0
         36 1 0
         37 1 0
         38 1 0
         39 1 0
         40 2 0
         41 2 0
         42 2 0
         43 1 0
         44 2 0
         45 1 0
         46 2 0
         47 1 0
         48 1 0
         49 2 0
         50 1 0
         51 2 0
         52 2 0
         53 1 0
         54 1 0
         55 1 0
         56 1 1
         57 1 1
         58 1 1
         59 1 1
         60 2 0
         61 2 0
         62 2 0
         63 2 0
         64 2 0
         65 1 1
         66 1 1
         67 2 1
         68 2 0
         69 1 1
         70 1 1
         71 3 0
         72 1 0
         73 1 0
         74 1 0
         75 2 0
         76 1 0
         77 1 0
         78 2 0
         79 2 1
         80 1 0
         81 1 1
         82 1 0
         83 1 0
         84 1 1
         85 1 0
         86 1 0
         87 2 0
         88 1 0
         89 1 0
         90 1 0
         91 1 0
         92 1 1
         93 1 0
         94 2 0
         95 2 0
         96 1 0
         97 2 0
         98 1 1
         99 1 0
        100 2 0
        end
        label values id misslab
        label values marstat marstat_ac1
        label def marstat_ac1 1 "1 Never married", modify
        label def marstat_ac1 2 "2 Married/civil union", modify
        label def marstat_ac1 3 "3 Divorced/dissolved civil union", modify
        label values east east_ac1
        label def east_ac1 0 "0 No", modify
        label def east_ac1 1 "1 Yes", modify
        Separate graphs are fine as well.

        Kind regards
        Sabine


        • #5
          I tried so many (complex) versions and unfortunately missed the simple and right one – don't ask me why ...

          Code:
          graph bar, over(east) over(marstat)
          graph bar (count), over(east) over(marstat)
          Special thanks to Marcos!

          So I still have to find out how to color the bars for east and west differently and how to move the Yes/No bar labels from the axis to the legend.

          Kind regards
          Sabine


          • #6
            It's me again. I realised that
            Code:
            graph bar, over(east) over(marstat)
            shows the row percentage. I'm looking for the column percentage. I get it when I swap east and marstat, but then the Yes and No bars aren't next to each other.
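
            One way to check which percentage a bar graph is showing is to compare the bar heights against a plain two-way table. The sketch below assumes the example data from #4 are in memory; it only prints the two sets of percentages and does not draw anything.

            ```stata
            * Percentages within each row (each category of marstat)
            tabulate marstat east, row nofreq

            * Percentages within each column (each category of east)
            tabulate marstat east, column nofreq
            ```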


            • #7
              Thanks for the data example. I think you'll find that the results of these commands differ:

              Code:
               
              graph bar, over(east) over(marstat)
              graph bar, over(marstat) over(east)
              Personally I prefer explicit control of how percents are calculated. That's possible for example with catplot (SSC) and tabplot (SJ). They're both mine but for showing graphs of two-way tables of categorical data, as here, I lean increasingly to the latter.

              Ideally you can access http://www.stata-journal.com/article...article=gr0066 . If not, then http://www.statalist.org/forums/foru...updated-on-ssc gives an overview. In either case, install the tabplot files from the Stata Journal site after typing search tabplot

              This code follows your data example code.

              Code:
              * The results of these two commands differ
              graph bar, over(east) over(marstat)
              graph bar, over(marstat) over(east)

              * tabplot with percents calculated within categories of east
              tabplot marstat east, percent(east) showval
              tabplot marstat east, percent(east) showval xla(1 "West" 2 "East") ///
                  ytitle("") xtitle("")

              * Build a text label holding both the count and the percent within east
              bysort marstat east : gen count = _N
              bysort east : gen total = _N
              gen show = string(count) + "  " + string(100 * count/total, "%2.1f") + "%"

              * Show that label on the bars
              tabplot marstat east, percent(east) showval(show) xla(1 "West" 2 "East") ///
                  ytitle("") xtitle("") subtitle(% by origin) bfcolor(none)
              [Image: eastwest.png]



              The last example (graph above) is the most elaborate. My main tip is that hybrid graphical tables often work well.

              Small tips include:

              0. Let the axes carry the labels. Separate keys or legends are an evil to be avoided if possible.

              1. The axis titles with variable names should often be removed in favour of a verbal explanation in your caption.

              2. tabplot allows customised bar labels. Here by showing absolute counts and percents I suggest that you may not need two graphs.
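
              For readers without tabplot or catplot, the explicit control over percent calculation mentioned earlier in this post can also be sketched with official commands only. This is an illustration rather than part of the original code: the variable names count, total and pct are arbitrary, and the data from #4 are assumed to be in memory.

              ```stata
              preserve

              * Collapse to one observation per marstat-by-east cell, with its frequency
              contract marstat east, freq(count)

              * Column percentages: each cell as a percent of its east category
              bysort east : egen total = total(count)
              gen pct = 100 * count / total

              * Plot the precomputed percents as they are; asyvars puts the east bars side by side
              graph bar (asis) pct, over(east) over(marstat) asyvars ytitle("Percent within east")

              restore
              ```

              Because the percentages are computed before plotting, it is easy to list them and confirm that they are the ones wanted.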



              • #8
                Thanks, Nick, for your help. The "problem" with the color and the labels can be solved with asyvar.
                [Image: Marstat.png]

                The overlapping labels can be managed with relabel(). And with
                Code:
                catplot east marstat , recast(bar) percent(east) asyvar
                I get the column percentage.
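
                For the count graph, the combination of asyvar and relabel() described above might look like the following sketch. The shortened label texts are only examples, the legend() option is one assumed way of renaming the legend entries, and the data from #4 are assumed to be in memory.

                ```stata
                * asyvars gives the east bars different colors and moves their labels to the legend;
                * relabel() shortens the marstat labels that remain on the axis
                graph bar (count), over(east) ///
                    over(marstat, relabel(1 "Never married" 2 "Married" 3 "Divorced")) ///
                    asyvars legend(order(1 "West" 2 "East"))
                ```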
                [Image: Marstat_perc.png]


                Perfect! Now I'm happy!


                • #9
                  You're happy, but no supervisor, mentor, boss, examiner or reviewer worthy of respect will like those overlapping labels.

                  You know a way to fix them, but to reduce or even remove the problem, use the catplot default of horizontal bars.

                  (Naturally, the tabplot graph in #7 doesn't have this problem.)
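
                  As a sketch, the horizontal version is simply the command from #8 without recast(bar), assuming catplot is installed from SSC and the data from #4 are in memory:

                  ```stata
                  * Install once if needed: ssc install catplot
                  * Without recast(bar), catplot draws horizontal bars, which leave room for long labels
                  catplot east marstat, percent(east) asyvars
                  ```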


                  • #10
                    Nick, I totally agree that overlapping labels are really bad style.
                    But I prefer the vertical form of catplot. Does your command have an equivalent of the relabel() suboption, as in
                    Code:
                    graph bar (count), over(east, relabel(1 "West" 2 "East")) over(marstat, relabel(1 "Single" 2 "Married" 3 "Separated" 4 "Widowed"))
                    which avoids the overlapping labels?


                    • #11
                      De gustibus non est disputandum, but I'd suggest a consideration of the eye movements required to understand the vertical form, popular though it is.


                      • #12
                        Should Sabine prefer to use the horizontal form, I suggest adjusting the font size of the legend as well as the bar width.
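
                        A rough sketch of such adjustments with official graph hbar is given below. It assumes the data from #4 are in memory; the numbers 40 and 10 are arbitrary starting values, and the bar width is changed only indirectly here, through the gaps between bars.

                        ```stata
                        * gap() sets the space between marstat groups, bargap() the space between the east bars,
                        * both as a percentage of bar width; legend(size()) sets the legend font size
                        graph hbar (count), over(east) over(marstat, gap(40)) asyvars ///
                            bargap(10) legend(size(small))
                        ```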

                        Last but not least, I don't see any good reason to keep the numbers together with the labels.

                        Well, I said "last" a couple of words ago, but I fear I have another "last" recommendation: maybe she could get rid of the last label ("widow with surviving partner"). Apart from being too long, it has < 1%, hence the information could be given just as a note.
                        Last edited by Marcos Almeida; 08 May 2017, 09:30.
                        Best regards,

                        Marcos


                        • #13
                          Marcos: I don't know. Categories with very small percents can still be of interest.

                          Forgive me for puffing my own work, but a feature of tabplot is that you (can) see more than just a bar of negligible height. #7 is an example to hand.


                          • #14
                            I see your point, Nick, and I agree with it, all the more so since you pointed out that tabplot provides "hope", I mean, a strategy to make tiny percentages visible in a graph.

                            Surely (SJ) tabplot and (SSC) catplot are great user-written programs.

                            That said, just as a side note, no matter which resource we rely on, I gather some category labels are quite long, such as "Divorced dissolved civil union" as well as "Widowed surviving partner".

                            All in all, should these long labels be kept, the horizontal form, as you wisely remarked, is surely the best approach.
                            Best regards,

                            Marcos


                            • #15
                              You can also save space by stacking axis labels, e.g. by adding to the code of #7

                              Code:
                              yla(3 "Never married" 2 `" "Married/" "civil union" "' 1 `" "Divorced/dissolved" "civil union" "')
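
                              Put together with the second tabplot call of #7, the full command might read as in the sketch below. It assumes tabplot is installed and the data from #4 are in memory; nothing beyond the options already shown in this thread is added.

                              ```stata
                              * Stacked y-axis labels: each compound-quoted label is split over two lines
                              tabplot marstat east, percent(east) showval xla(1 "West" 2 "East") ///
                                  yla(3 "Never married" 2 `" "Married/" "civil union" "' 1 `" "Divorced/dissolved" "civil union" "') ///
                                  ytitle("") xtitle("")
                              ```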
