  • Is it possible to overlay a stacked bar plot and a line plot?

    I have two figures that I want to overlay in a single figure. The first is created using
    Code:
    graph bar x1 x2 x3, over(year, gap(5)) stack
    The second is created using
    Code:
    twoway connected x4 year
    Is there a way to force them to show up together in a single figure?

  • #2
    No. graph bar is not a twoway command, so it cannot be overlaid with twoway plots. You need something more like


    Code:
    gen x1x2 = x1 + x2 
    gen x1x2x3 = x1x2 + x3 
    
    twoway bar x1 year, barw(0.9) || rbar x1 x1x2 year, barw(0.9) || rbar x1x2 x1x2x3 year, barw(0.9) || connected x4 year
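
    A self-contained sketch of the same idea on made-up data may help. The values below are invented purely for illustration, and x4 is simply assumed to be the total of the three components. Each rbar segment runs from the previous running total to the next, which is what makes the bars stack. Run it from a do-file so that the /// line continuations work.

    ```stata
    * toy data: hypothetical values, for illustration only
    clear
    set obs 8
    set seed 2803
    gen year = 2015 + _n
    gen x1 = runiformint(1, 5)
    gen x2 = runiformint(1, 5)
    gen x3 = runiformint(1, 5)
    gen x4 = x1 + x2 + x3

    * running totals give the lower and upper limits of each stacked segment
    gen x1x2 = x1 + x2
    gen x1x2x3 = x1x2 + x3

    twoway bar x1 year, barw(0.9)              ///
        || rbar x1 x1x2 year, barw(0.9)        ///
        || rbar x1x2 x1x2x3 year, barw(0.9)    ///
        || connected x4 year,                  ///
        legend(order(1 "x1" 2 "x2" 3 "x3" 4 "x4")) xtitle("")
    ```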


    • #3
      This works well. Thank you!


      • #4
        What if x1 and x2 can switch between negative and positive values in different years? The above solution only works if all values have the same sign.


        • #5
          If your variables can be negative as well as positive, a stacked bar chart doesn't seem a good idea anyway.


          • #6
            This is why the connected-line overlay is required: it shows the sum of the positive and negative components. Imagine the variables are contributors to the inflation rate over time, and some contributors can become negative in certain periods. I want to show the individual components stacked on top of each other, with aggregate inflation overlaid.


            • #7
              This may help in the absence of a data example. The idea is to stack bar segments below 0 if negative and above 0 otherwise. That obliges keeping track of the sums of positives and negatives so far.

              Even if you like it, I would do some field testing of whether people can follow it. Conversely, if you are under instruction to do this, then that settles the matter.

              Code:
              clear 
              set obs 10 
              set seed 314159
              gen y = 2012 + _n
              
              gen x1 = 0.5 +  runiformint(1, 6)
              gen x2 = 0.5 +  runiformint(-1, 4)
              gen x3 = 0.5 + runiformint(-3, 2)
              
              gen top = 0 
              gen bottom = 0 
              
              forval j = 1/3 { 
                  
                  gen base`j' = cond(x`j' >=  0, top, bottom)
                  gen show`j' = base`j' + x`j'
                  
                  replace top = top + x`j' if x`j' >= 0 
                  replace bottom = bottom + x`j' if x`j' < 0 
              }
              
              gen total = x1 + x2 + x3 
              local opts barw(0.9)
              twoway rbar base1 show1 y, `opts' ///
                  || rbar base2 show2 y, `opts' ///
                  || rbar base3 show3 y, `opts' ///
                  || line total y, xtitle("") xla(2013/2022) lw(thick) lc(black) ///
                  legend(order(1 "x1" 2 "x2" 3 "x3")) ytitle(whatever)

              [Figure: oddstack.png]


              • #8
                Awesome! That's perfect. Thanks so much


                • #9
                  Hi Nick Cox, would you please help me with this graph error?

                  I have four indicator (0/1) variables and one continuous variable. I would like a stacked bar graph of those four variables over the years, with a line graph of the continuous variable overlaid.

                  My code:

                  gen fshdfsld = FSHD + FSLD
                  gen fshdfsldfihd = fshdfsld + FIHD
                  gen fshdfsldfihdfild = fshdfsldfihd + FILD

                  twoway bar FSHD year, barw(0.9) ///
                  || rbar FSHD fshdfsld year, barw(0.9) ///
                  || rbar fshdfsld fshdfsldfihd year, barw(0.9) ///
                  || rbar fshdfsldfihd fshdfsldfihdfild year , barw(0.9) ///
                  || line _margin_n year , b1title("Year") ///
                  legend(label(1 "FSHD") label(2 "FSLD") ///
                  label(3 "FIHD") label(4 "FILD")) legend(size(small)) ///
                  ytitle(Percent) ylabel(0 "0" .2 "20%" .4 "40%" .6 "60%" .8 "80%" 1 "100%", ///
                  noticks nogrid angle(0) labsize(small)) ylab(0(10)100)

                  But I'm getting a very weird graph.
                  Last edited by Sanchita Chakrovorty; 03 May 2024, 03:54.


                  • #10
                    There is no data example in #9, but on the evidence you give the result doesn't surprise me. The stacked bars are sums of 0s and 1s, while it's clear that _margin_n has values close to 70. The problem is that you are plotting quantities of quite different magnitudes; it is not a syntax error, as Stata produced the graph you asked for.

                    Please show us a data example to get constructive advice. Show us the results of

                    Code:
                    dataex FSHD FSLD FIHD FILD _margin_n year
                    copied and pasted to here.

                    A different improvement would be to specify xlabel(2007(2)2019). xtitle() and b1title() convey the same information; in fact no one interested in your results will need to be told that 2007 to 2019 are years. So just omit the call to b1title() and zap the axis title with xtitle("")


                    • #11
                      Hi Nick, thank you. Here are my data and the code I used. The output I am getting is not stacked; maybe I'm doing something wrong. I am looking for something like the second figure, but with a line overlaid on it. Would you please help me get the desired graph?


                      Code:
                      gen fshdfsld = (FSHD + FSLD)
                      gen fshdfsldfihd = fshdfsld + FIHD
                      gen fshdfsldfihdfild = fshdfsldfihd + FILD
                      gen _margin_n= _margin/100
                      
                      twoway bar FSHD year, barw(0.9) ///
                              || rbar FSHD fshdfsld year, barw(0.9) ///
                              || rbar fshdfsld fshdfsldfihd year, barw(0.9) ///
                              || rbar fshdfsldfihd fshdfsldfihdfild year , barw(0.9) ///
                              || line _margin_n year  ,  ///
                              legend(label(1 "FSHD") label(2 "FSLD") ///
                          label(3 "FIHD") label(4 "FILD")) legend(size(small)) ///
                          ytitle(Percent) ylabel(0 "0" .2 "20%" .4  "40%" .6 "60%" .8 "80%" 1 "100%", ///
                          noticks nogrid angle(0) ///
                          labsize(small))  ylab(0(10)100) xla(2007(2)2018) //b1title("Year")
                      
                      
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input float(FSHD FSLD FIHD FILD _margin_n) double year
                      0 0 0 1 .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      1 0 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 0 0 1 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 0 1 0 .6649266 2007
                      1 0 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 0 1 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 0 0 1 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      0 0 1 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 0 0 1 .6649266 2007
                      0 0 1 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      1 0 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 0 1 0 .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      0 1 0 0 .6649266 2007
                      . . . . .6649266 2007
                      1 0 0 0 .6649266 2007
                      0 0 0 1 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      0 1 0 0 .6649266 2007
                      end


                      • #12
                        Thanks for the detail, but unfortunately this doesn't help me much.

                        It looks as if _margin_n is constant for each year, so a line plot is easy for that.

                        Correction to #10: _margin_n has values close to 0.7; it's just that the axis labels show percents. That is in the code. But that correction makes it more mysterious why the bars hardly show up.

                        But you have multiple observations for each year. There are not distinct (unique) combinations of your variables FSHD FSLD FIHD FILD for each year in practice.

                        Adding (stacking) your four indicator variables in principle produces values between 0 and 4. In your data example the total is either 0 or 1.

                        Also, any sum 0 + 0 + 0 + 0 yields a bar of zero height, which is invisible and so no use.

                        Perhaps you're imagining that twoway bar somehow counts values for you, but no; it just plots the values you feed it.

                        You have in principle (ignoring missing) 16 possible combinations of FSHD FSLD FIHD FILD if they are (0, 1) indicators.

                        So you might make progress with

                        Code:
                        egen Fcomb = group(FSHD FSLD FIHD FILD), label
                        and plotting the frequencies of each combination of interest separately.
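
                        One way to do that might look like the sketch below, assuming that counts per year are what is wanted and that Fcomb has just been created as above. contract reduces the data to one observation per year and combination, and separate splits the counts into one variable per combination. The name n and the choice of a connected plot are illustrative assumptions, not part of the question.

                        ```stata
                        * assumes Fcomb already exists, created by the egen call above
                        preserve
                        * group() leaves Fcomb missing whenever any indicator is missing
                        drop if missing(Fcomb)
                        * n = number of observations for each year and combination
                        contract year Fcomb, freq(n)
                        * one count variable per combination, labelled by the combination
                        separate n, by(Fcomb) veryshortlabel
                        twoway connected `r(varlist)' year, sort xtitle("") ytitle("Frequency")
                        restore
                        ```

                        The preserve and restore pair keeps the original dataset intact, since contract replaces the data in memory.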

                        There's some economics, or some other substantive question, behind this. I don't see that further advice is easy without much more explanation.
                        Last edited by Nick Cox; 04 May 2024, 05:21.


                        • #13
                          No more information received, but I will try once more. You can get variables all on a proportion scale with


                          Code:
                          foreach v in FSHD FSLD FIHD FILD { 
                              egen `v'_mean = mean(`v'), by(year) 
                          }
                          and then call up a line plot

                          Code:
                          line *mean _margin_n year,  xlabel(2007(2)2019) xtitle("")
                          I don't get a sense that stacked bars are a good idea, but then again I don't know what these variables are.
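
                          If stacked bars are wanted regardless, here is a sketch that combines the running-total approach of #2 with the yearly means above. It assumes the *_mean variables from the loop already exist and that _margin_n is constant within each year; the names s1 to s4 and tag are made up for illustration. The height of each stack is the sum of the four yearly proportions. Run it from a do-file so that the /// line continuations work.

                          ```stata
                          * running totals of the yearly proportions
                          gen s1 = FSHD_mean
                          gen s2 = s1 + FSLD_mean
                          gen s3 = s2 + FIHD_mean
                          gen s4 = s3 + FILD_mean

                          * one observation per year is enough for plotting
                          egen tag = tag(year)

                          twoway bar s1 year if tag, barw(0.9)           ///
                              || rbar s1 s2 year if tag, barw(0.9)       ///
                              || rbar s2 s3 year if tag, barw(0.9)       ///
                              || rbar s3 s4 year if tag, barw(0.9)       ///
                              || line _margin_n year if tag, sort lw(thick) lc(black) ///
                              xlabel(2007(2)2019) xtitle("") ytitle("Proportion")     ///
                              legend(order(1 "FSHD" 2 "FSLD" 3 "FIHD" 4 "FILD" 5 "_margin_n"))
                          ```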
