  • Stacked bar graph (stacking by group, rather than all)

    Hello, I have the following data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year float(emp1 emp2 emp3 emp4)
    2000  8.287  349.506   62.41  456.367
    2001  7.862  427.525  69.546  598.399
    2002 16.165  611.199  75.528  744.512
    2003 11.914  777.101  87.031  897.903
    2004 18.107  944.163  83.388 1076.389
    2005 22.732 1100.708 115.292 1210.571
    2006 20.412 1316.654 134.195 1365.223
    2007 31.985 1526.965 168.284 1547.406
    2008 25.674   1658.5 154.205 1680.176
    2009 21.161 1739.472 140.922   1790.8
    2010 28.375 1957.499 217.189 1889.972
    2011 43.873 2267.242 167.622 2083.245
    2012 32.195 2429.847 146.445 2084.692
    2013 39.729  2697.61 124.509 2163.135
    2014 32.863 3020.331 147.221 2248.292
    2015 27.374 3376.271 151.724 2393.775
    2016 39.595 3708.233 108.538 2611.653
    end

    I'm creating the following bar graph:

    Code:
    * draw graph for first decomposition (continuers & exiters)
    graph bar emp1 emp2 emp3 emp4, over(year,label(labsize(small) angle(45)) gap(20)) stack ///
        bar(1, color(navy*0.8)) bar(2, color(navy)) bar(3, color(green)) bar(4, color(dkgreen)) graphregion(color(white)) ///
        ytitle("Manufacturing Employment (k workers)") ///
        legend(on label(1 "Foreign exiters") label(2 "Foreign continuers") ///
        label(3 "Private exiters") label(4 "Private continuers") ///
        region(lstyle(foreground)) rows(1) position(6) span) outergap(10)
    I included the stack option because I want to stack some of the data. However, I only want to stack 'emp1' and 'emp2' together and, separately, 'emp3' and 'emp4' together, giving two stacked bars in each year. At present all four are stacked together, which produces only one stacked bar in each year.
    Any ideas how to implement this?

    Thanks,
    Jad
    Last edited by Jad Tamimi; 18 Jun 2024, 21:39.

  • #2
    I also want to keep the current dataset format, with 4 different series for 4 different legend entries. Really, I want my graph to look exactly like this version:

    Code:
    graph bar emp1 emp3 , over(year,label(labsize(small) angle(45)) gap(20)) ///
    bar(1, color(navy)) bar(2, color(dkgreen)) graphregion(color(white)) ///
    ytitle("Manufacturing Employment (k workers)") ///
    legend(on label(1 "Foreign exiters") ///
    label(2 "Private exiters") ///
    region(lstyle(foreground)) rows(1) position(6) span) outergap(10)
    The only difference is that I now want an additional bar stacked onto each of those two, and four legend entries instead of two.

    Please let me know if it's possible to do this in Stata.

    Thanks,
    Jad
    Last edited by Jad Tamimi; 18 Jun 2024, 22:29. Reason: forgot the code option



    • #3
      I wouldn't insist on a stacked design here, especially since it appears that you are stacking emp1 and emp3 as well as emp2 and emp4, and these variables have very different magnitudes. In stacked designs, small quantities are difficult to see. See #3 by Nick Cox for drawbacks relating to a stacked design: https://www.statalist.org/forums/for...cked-bar-chart. For an alternative, see tabplot from the Stata Journal.

      Code:
      search tabplot
      To create the stacked bars, you need to switch to twoway bar. Here is an example that stacks emp1 with emp2 and emp3 with emp4, unlike your example.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int year float(emp1 emp2 emp3 emp4)
      2000  8.287  349.506   62.41  456.367
      2001  7.862  427.525  69.546  598.399
      2002 16.165  611.199  75.528  744.512
      2003 11.914  777.101  87.031  897.903
      2004 18.107  944.163  83.388 1076.389
      2005 22.732 1100.708 115.292 1210.571
      2006 20.412 1316.654 134.195 1365.223
      2007 31.985 1526.965 168.284 1547.406
      2008 25.674   1658.5 154.205 1680.176
      2009 21.161 1739.472 140.922   1790.8
      2010 28.375 1957.499 217.189 1889.972
      2011 43.873 2267.242 167.622 2083.245
      2012 32.195 2429.847 146.445 2084.692
      2013 39.729  2697.61 124.509 2163.135
      2014 32.863 3020.331 147.221 2248.292
      2015 27.374 3376.271 151.724 2393.775
      2016 39.595 3708.233 108.538 2611.653
      end
      
      
      
      gen x1= year-2000-.25
      gen x2= year-2000+.25
      gen emp2s = emp1+emp2
      gen emp4s= emp3+emp4
      twoway (bar emp2s emp1 x1, barwidth(0.4 ..)) ///
       (bar emp4s emp3 x2, barwidth(0.4 ..)), ///
       xtitle(2000s) xlab(0/16, format(%02.0f) noticks) ///
       leg(order(2 "emp1" 1 "emp2" 4 "emp3" 3 "emp4")) ///
       plotregion(margin(zero))
      [Image: Graph.png]
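
      The stacking above works by overlaying: for each pair, twoway bar draws the cumulative total first (emp2s or emp4s) and then draws the lower component (emp1 or emp3) on top of it, so the part of the taller bar that stays visible has the height of the second component. A minimal check of that arithmetic, assuming the variables created in the code above are still in memory (the tolerance allows for float rounding):

      ```stata
      * Sketch only: the visible upper segment of each stacked bar
      * should equal the second component of the pair
      assert reldif(emp2s - emp1, emp2) < 1e-5
      assert reldif(emp4s - emp3, emp4) < 1e-5
      ```

      This is also why the legend order is given as 2 1 4 3: within each bar plot the total is the first series and the lower component is the second.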



      • #4
        A better graph might follow from knowing what emp1 emp2 emp3 emp4 are. (Surely any readers of this graph need to be told?)

        For example emp1 + emp2 might be male employment and emp3 + emp4 might be female employment. In that case a more complicated display but one closer to what is wanted would compare the two totals

        Code:
        emp1 + emp2
        
        emp3 + emp4
        and then compare also

        emp1 as a fraction of its total

        emp3 as a fraction of its total.

        That would be best done in two aligned line charts.

        As Andrew Musau underlines, a key limitation of stacked bar charts is that small amounts are hard to see. Also, it's often a toss-up on whether we want to see absolute or relative amounts.


        #1 and #2 would have been clearer if the graphs had been shown as .png attachments.



        • #5
          Here is the idea of #4 developed.

          #4 was unfair insofar as some explanation in terms of exiters and continuers is included in the code.

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input int year float(emp1 emp2 emp3 emp4)
          2000  8.287  349.506   62.41  456.367
          2001  7.862  427.525  69.546  598.399
          2002 16.165  611.199  75.528  744.512
          2003 11.914  777.101  87.031  897.903
          2004 18.107  944.163  83.388 1076.389
          2005 22.732 1100.708 115.292 1210.571
          2006 20.412 1316.654 134.195 1365.223
          2007 31.985 1526.965 168.284 1547.406
          2008 25.674   1658.5 154.205 1680.176
          2009 21.161 1739.472 140.922   1790.8
          2010 28.375 1957.499 217.189 1889.972
          2011 43.873 2267.242 167.622 2083.245
          2012 32.195 2429.847 146.445 2084.692
          2013 39.729  2697.61 124.509 2163.135
          2014 32.863 3020.331 147.221 2248.292
          2015 27.374 3376.271 151.724 2393.775
          2016 39.595 3708.233 108.538 2611.653
          end
          
          gen toshow1 = emp1 + emp2
          gen toshow2 = emp3 + emp4
          gen pc1 = 100 * emp1 / toshow1
          gen pc2 = 100 * emp3 / toshow2
          
          set scheme s1color
          
          line toshow? year, lc(blue red) name(G1, replace) xsc(off) fysize(60) ///
          || scatteri 3750 2013 "emp1 + emp2", ms(none) mlabcolor(blue) mlabsize(medlarge) ///
          || scatteri 2200 2013 "emp3 + emp4", ms(none) mlabcolor(red) mlabsize(medlarge) ///
          legend(off)
          
          line pc? year, lc(blue red) name(G2, replace) fysize(40) ///
          || scatteri 7.5  2013 "emp3 as %", ms(none) mlabcolor(red) mlabsize(medlarge) ///
          || scatteri 2.5 2013 "emp1 as %", ms(none) mlabcolor(blue) mlabsize(medlarge) ///
          legend(off) xla(2000(4)2016)
          
          graph combine G1 G2, col(1) xcommon imargin(vsmall)
          [Image: two_panel.png]

          Last edited by Nick Cox; 19 Jun 2024, 06:06.



          • #6
            Thanks! I decided to go with this option:

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input int year float(emp1 emp2 emp3 emp4)
            2000  8.287  349.506   62.41  456.367
            2001  7.862  427.525  69.546  598.399
            2002 16.165  611.199  75.528  744.512
            2003 11.914  777.101  87.031  897.903
            2004 18.107  944.163  83.388 1076.389
            2005 22.732 1100.708 115.292 1210.571
            2006 20.412 1316.654 134.195 1365.223
            2007 31.985 1526.965 168.284 1547.406
            2008 25.674   1658.5 154.205 1680.176
            2009 21.161 1739.472 140.922   1790.8
            2010 28.375 1957.499 217.189 1889.972
            2011 43.873 2267.242 167.622 2083.245
            2012 32.195 2429.847 146.445 2084.692
            2013 39.729  2697.61 124.509 2163.135
            2014 32.863 3020.331 147.221 2248.292
            2015 27.374 3376.271 151.724 2393.775
            2016 39.595 3708.233 108.538 2611.653
            end
            
            gen x1 = year-0.225
            gen x2 = year+0.225
            gen emp2_ = emp1+emp2 
            gen emp4_= emp3+emp4 
            
            twoway (bar emp2_ emp1 x1, barwidth(0.45 ..) color(dknavy navy) lwidth(none none)) ///
             (bar emp4_ emp3 x2, barwidth(0.45 ..) bcolor(dkgreen green) lwidth(none none)), plotregion(margin(zero)) ///
             xlabel(2000(1)2016, labsize(small) angle(45) noticks labgap(tiny) nogrid) ylabel(, angle(vertical) format(%9.0gc) gstyle(solid)) ///
             ysc(r(0 4175)) xsc(r(1999 2017)) ytitle("Manufacturing Employment (k workers)") ///
             legend(on order(2 "Foreign exiters" 1 "Foreign continuers" 4 "Private exiters" 3 "Private continuers") ///
            region(lstyle(foreground)) rows(1) position(6) span)
            Last edited by Jad Tamimi; 19 Jun 2024, 13:15. Reason: Wrong code



            • #7
              I feel as though there should be easier ways to make this bar graph.



              • #8
                It’s a common feeling: all I want is a relatively simple bar chart!

                It is easy enough to look at the graph and understand the principle. What’s biting is that Stata still needs to be told exactly what to put where.

                You could call this anything you like — say a twinned double stacked bar chart — and write a dedicated command but it’s hard for me to imagine that the syntax would be much simpler than the direct syntax.

                I don’t see that the stacked design can be used to read anything but gross trends.
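
                To illustrate the point about a dedicated command, here is a rough sketch of what such a wrapper might look like. The name twinstack and its over() and offset() options are hypothetical and invented for this example; this is not an existing official or community-contributed command. It assumes four numeric variables given in the order lower1 upper1 lower2 upper2 and simply repackages the twoway bar approach from #3 and #6.

                ```stata
                * Hypothetical wrapper (not an existing command): twinned double stacked bars
                capture program drop twinstack
                program define twinstack
                    version 14
                    * varlist order assumed: lower1 upper1 lower2 upper2
                    syntax varlist(min=4 max=4 numeric), over(varname numeric) [offset(real 0.225) *]
                    tokenize `varlist'
                    tempvar x1 x2 tot1 tot2
                    quietly {
                        gen double `x1' = `over' - `offset'    // left bar position
                        gen double `x2' = `over' + `offset'    // right bar position
                        gen double `tot1' = `1' + `2'          // cumulative height, pair 1
                        gen double `tot2' = `3' + `4'          // cumulative height, pair 2
                    }
                    local w = 2 * `offset'                     // paired bars just touch
                    twoway (bar `tot1' `1' `x1', barwidth(`w' ..)) ///
                           (bar `tot2' `3' `x2', barwidth(`w' ..)), ///
                           legend(order(2 "`1'" 1 "`2'" 4 "`3'" 3 "`4'")) `options'
                end

                * Example call; any further twoway options are passed through
                twinstack emp1 emp2 emp3 emp4, over(year) xlabel(2000(1)2016, angle(45))
                ```

                Even with the wrapper, colours, legend text and axis options still have to be spelled out in the call, so the saving over the direct syntax is modest.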



                • #9
                  [Image: exiters.png]

                  Here's a tabplot solution as mentioned by Andrew Musau

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input int year float(emp1 emp2 emp3 emp4)
                  2000  8.287  349.506   62.41  456.367
                  2001  7.862  427.525  69.546  598.399
                  2002 16.165  611.199  75.528  744.512
                  2003 11.914  777.101  87.031  897.903
                  2004 18.107  944.163  83.388 1076.389
                  2005 22.732 1100.708 115.292 1210.571
                  2006 20.412 1316.654 134.195 1365.223
                  2007 31.985 1526.965 168.284 1547.406
                  2008 25.674   1658.5 154.205 1680.176
                  2009 21.161 1739.472 140.922   1790.8
                  2010 28.375 1957.499 217.189 1889.972
                  2011 43.873 2267.242 167.622 2083.245
                  2012 32.195 2429.847 146.445 2084.692
                  2013 39.729  2697.61 124.509 2163.135
                  2014 32.863 3020.331 147.221 2248.292
                  2015 27.374 3376.271 151.724 2393.775
                  2016 39.595 3708.233 108.538 2611.653
                  end
                  
                  
                  reshape long emp, i(year) j(which)
                  label def which 1 "Foreign exiters" 2 "Foreign continuers" 3 "Private exiters" 4 "Private continuers"
                  label val which which
                  
                  forval y = 2000/2016 {
                      local this : di %tyYY `y'
                      label def year `y' "`this'", add
                  }
                  label val year year
                  label var year "year 20.."
                  
                  tabplot which year [iw=emp], showval(format(%1.0f)) barw(0.8) separate(which) subtitle("000") ytitle("")
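
                  The reshape long step is what lets tabplot treat the four series as the categories of a single variable. A quick way to see the resulting layout, assuming the code above has just been run (tabplot itself must be installed first, for example after search tabplot):

                  ```stata
                  * Sketch only: after -reshape long-, each year has four rows,
                  * one per series, identified by the labelled variable which
                  list year which emp if inlist(year, 2000, 2016), sepby(year) noobs
                  ```

                  Note that this changes the dataset from wide to long, so use preserve and restore around it if the original four-variable layout is needed afterwards.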



                  • #10
                    Anyone bemused by the minor trickery to get year 2000 to 2016 showing as 00 to 16 -- using value labels defined for the purpose -- may be reassured to learn that there is another way to do it.

                    Code:
                    label var year "year 20.."
                    
                    tabplot which year [iw=emp], xasis xla(, format(%tyYY)) showval(format(%1.0f)) barw(0.8) separate(which) subtitle("000") ytitle("")
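
                    For anyone unfamiliar with the display format used here, a small sketch of what %tyYY does to a yearly value; it is the same format used to build the value labels in #9:

                    ```stata
                    * Sketch only: %tyYY displays a year as its last two digits
                    display %tyYY 2000
                    display %tyYY 2016
                    ```

                    With this route the variable year needs no value labels attached.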
