  • Stacked Area Chart

    Hi,

    I've checked the forums but come up short. My data are organized as count, year and type, where type is categorical. I'm currently working with Stata to try to get stacked area graphs something like this:



    Here is my code so far:

    Code:
    graph twoway (area count year if type ==1, sort) ///
    (area count year if type ==2, sort) ///
    (area count year if type ==3, sort) ///
    (area count year if type ==4, sort) ///
    (area count year if type ==5, sort) ///
    (area count year if type ==6, sort) ///
    (area count year if type ==7, sort) ///
    (area count year if type ==8, sort)
    As you can probably see, it doesn't yet do what I want. There are two things I would like to do:

    First, a 'stacked' area chart with cumulative absolute values (like above). Second, one which is 'stacked' but shows percentages (a rectangle with colored shapes showing relative amounts by year).

    Please let me know if there is a way to do this with my values or if I have to create new variables with cumulative and percentage values.

    Thanks

  • #2
    No example data here, nor any use of a provided dataset, nor do you even show the graph you got: your illustration shows the kind of thing you want, not what your code gives. Please do keep reading FAQ Advice #12 -- because you are ignoring key advice, making your question less intelligible, less interesting and less likely to be answered.

    But the problem is that Stata (more precisely, twoway) just superimposes those areas. Each area is drawn from the default base of 0 up to count, and no part of the code knows about the others.

    Here is a dopey example (NB: adding incommensurates is silly, but I couldn't quickly find a dataset where cumulation is sensible) to show technique.

    Code:
    webuse grunfeld, clear
    collapse (sum) mvalue invest kstock, by(year)
    
    gen sum2 = mvalue + invest
    gen sum3 = mvalue + invest + kstock
    
    twoway area mvalue year || rarea mvalue sum2 year || rarea sum2 sum3 year, ///
    legend(order(3 "kstock" 2 "invest" 1 "mvalue") pos(3) col(1)) xtitle("") ///
    xla(1935(5)1950 1954) ytitle(dopey cumulation)
    [Image: stacked_area.png]

    That said, these graphs are in my view oversold. Even when there is an interpretation of parts of a sum, it's still hard to follow patterns in any variable except that plotted on the bottom. The legend (key) becomes a necessary evil. In principle, the graph idea is easy to grasp. In practice, people glance at it and move on. It's too much like hard work even here to try to work out whether the patterns in the other two variables are similar or different, let alone spot details of interest.

    (By the way, I would have written a community-contributed command automating the steps above if I thought the technique deserved it....)

    Here's an alternative that in my view almost always works better. https://www.statalist.org/forums/for...ailable-on-ssc

    Code:
    * clean up collapse side-effects
    foreach v in mvalue invest kstock {
    label var `v'
    }
    
    * if not installed:
    * ssc install multiline
    multiline mvalue invest kstock year, xtitle("") xla(1935(5)1950 1954)
    [Image: not_stacked_area.png]

    You have 8 components, not 3, but I doubt that makes anything easier.
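
    For the layout in #1 (one observation per year and type, with variables count, year and type), a minimal sketch of the same technique might look like the following. The data are simulated purely for illustration, and the variable names lower and upper and the legend labels are my own assumptions, not anything from the original data.

    Code:
    * simulated stand-in for the data in #1: 10 years x 8 types (assumption)
    clear
    set seed 2018
    set obs 10
    gen year = 2008 + _n
    expand 8
    bysort year: gen type = _n
    gen count = ceil(100 * runiform())
    
    * running sum within each year gives the top of each band;
    * the bottom is the top of the band below
    bysort year (type): gen upper = sum(count)
    gen lower = upper - count
    
    * build one rarea call and one legend entry per type
    local plots
    local order
    forvalues t = 1/8 {
        local plots `plots' (rarea lower upper year if type == `t', sort)
        local order `t' "type `t'" `order'
    }
    
    twoway `plots', legend(order(`order') pos(3) col(1)) ///
    xtitle("") ytitle("cumulative count")

    So new variables are needed, but only two of them: because the data are long, a single pair of lower and upper limits serves all 8 types.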
    Last edited by Nick Cox; 08 Aug 2018, 05:58.

    • #3
      Here is another way, plotting the percent for each category, using Nick's example:

      Code:
      webuse grunfeld, clear
      collapse (sum) mvalue invest kstock, by(year)
      * cumulative shares of the yearly total, in percent
      gen total = mvalue + invest + kstock
      gen percent1 = 100 * mvalue / total
      gen percent2 = 100 * (mvalue + invest) / total
      gen percent3 = 100
      gen zero = 0
      * each rarea fills the band between two successive cumulative shares
      twoway rarea zero percent1 year ///
          || rarea percent1 percent2 year ///
          || rarea percent2 percent3 year ///
          ||, legend(order(3 "kstock" 2 "invest" 1 "mvalue")) ///
          xla(1935(5)1950 1954) ytitle(dopey cumulation percent)
      [Image: Graph.png]
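
      The same idea might carry over to the long layout in #1 (variables count, year and type) roughly as follows. This is only a sketch: the data are simulated for illustration, and the names upper, plower and pupper are my own assumptions.

      Code:
      * simulated stand-in for the data in #1: 10 years x 8 types (assumption)
      clear
      set seed 2018
      set obs 10
      gen year = 2008 + _n
      expand 8
      bysort year: gen type = _n
      gen count = ceil(100 * runiform())
      
      * running sum within year; the last value in each year is the yearly total
      bysort year (type): gen upper = sum(count)
      by year: gen pupper = 100 * upper / upper[_N]
      by year: gen plower = 100 * (upper - count) / upper[_N]
      
      * one band per type, each between successive cumulative percents
      local plots
      local order
      forvalues t = 1/8 {
          local plots `plots' (rarea plower pupper year if type == `t', sort)
          local order `t' "type `t'" `order'
      }
      
      twoway `plots', legend(order(`order') pos(3) col(1)) ///
      xtitle("") ytitle("cumulative percent")

      The top band always ends at 100, so the bands fill a rectangle, which is the second kind of graph asked for in #1.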

      • #4
        Thanks so much to both of you! I will experiment with all three styles.

        • #5
          In #2 the cumulated areas should start at 0. It's yet another advantage of line graphs that no such rule applies.
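
          One way to do that, as a minimal sketch rather than the only fix, is to ask for the y axis scale to include 0 with ysc(r(0)), short for yscale(range(0)). The rest of the code is unchanged from #2.

          Code:
          webuse grunfeld, clear
          collapse (sum) mvalue invest kstock, by(year)
          
          gen sum2 = mvalue + invest
          gen sum3 = mvalue + invest + kstock
          
          * ysc(r(0)) extends the y axis scale so that it includes 0
          twoway area mvalue year || rarea mvalue sum2 year || rarea sum2 sum3 year, ///
          legend(order(3 "kstock" 2 "invest" 1 "mvalue") pos(3) col(1)) xtitle("") ///
          xla(1935(5)1950 1954) ytitle(dopey cumulation) ysc(r(0))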
