  • How to make a bar plot that breaks extremely large bars

    For example, something like this:


  • #2
    I think this defeats the purpose of the bar chart, making those broken bars look smaller than they really are.
    Perhaps plot your chart on the logarithmic scale?
    https://www.stata.com/statalist/arch.../msg00799.html


    • #3
      Originally posted by Sergiy Radyakin
      I think this defeats the purpose of the bar chart, making those broken bars look smaller than they really are.
      Perhaps plot your chart on the logarithmic scale?
      https://www.stata.com/statalist/arch.../msg00799.html

      In my data, most observations have values between 100 and 1000, but several observations are above 5000. If I keep the tallest bars or use log-transformed values, the plot gives much less information about the variation in the 100-1000 range. So I wish there were a way to break the bars.


      • #4
        Perhaps spend 3 minutes reading this post:
        https://peltiertech.com/broken-y-axis-in-excel-chart/


        • #5
          I agree with Sergiy that broken bars are a bad idea. Independently of our views it's crucial that Stata does not make scale breaks easy, and I guess that is deliberate. There is an FAQ on scale breaks at https://www.stata.com/support/faqs/g.../scale-breaks/ Some years ago I stumbled across a complaint -- perhaps on Twitter -- that the tone of this FAQ was very much "this is a bad idea; don't do it, really". My co-author Scott Merryman can speak for himself but I was delighted that the message was understood.

          I don't think bar charts march well with logarithmic scales.

          I don't know how the graph posted in #1 was produced but I note that there is a change of scale as well as a scale break.

          To push the issue further, I note that, for once, the pattern of the data in #1 would be very clear in a table as the very high values for December and January would leap out at any careful reader.

          Beyond that, I really don't see that log transformation gives less information on the pattern of lower values. That's the opposite of what a logarithmic scale does.
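
          As a rough check on that point, the share of the axis given to the 100-1000 range can be compared under the two scales. This is only a sketch: the linear axis is assumed to run from 0 to 100000 and the logarithmic axis from 100 to 100000, limits chosen for illustration rather than taken from any posted graph.

          ```stata
          * Assumed axis limits: linear 0 to 1e5, logarithmic 1e2 to 1e5
          * Share of a linear axis occupied by 100-1000: under 1%
          display (1000 - 100) / (1e5 - 0)
          * Share of a logarithmic axis occupied by 100-1000: one third
          display (log10(1000) - log10(100)) / (log10(1e5) - log10(1e2))
          ```

          Under these assumed limits, the logarithmic scale gives the lower values far more room than the linear scale does.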

          For a longer series, using line rather than connected might be better. Whether axis labels for December and January would be a good idea is an interesting detail but secondary to the main theme.

          To fix ideas I read the data roughly off the graph:

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(sales mdate)
             200 589
             300 590
             200 591
             300 592
             200 593
             300 594
             400 595
             450 596
            1200 597
          100000 598
           85000 599
             450 600
          end
          format %tmMon_YY mdate
          
          twoway connected sales mdate, ysc(log) yla(1e5 1e4 1e3 1e2, ang(h)) xla(`=ym(2009, 2)'(3)`=ym(2010, 2)') xtitle("")
          [Image: sales.png]
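
          Two suggestions made earlier in this post, a plain table and line rather than connected, could be sketched as below. The sketch assumes the example dataset from the Code block above is still in memory; the graph options are copied from that command, and only the plot type is changed.

          ```stata
          * Assumes the example data (variables sales and mdate) are in memory
          confirm variable sales mdate

          * A plain listing: the two very high values stand out at once
          list mdate sales, noobs separator(0)

          * Same graph as above, but with -line- instead of -connected-
          twoway line sales mdate, ysc(log) yla(1e5 1e4 1e3 1e2, ang(h)) xla(`=ym(2009, 2)'(3)`=ym(2010, 2)') xtitle("")
          ```

          With only twelve points the difference between the two plot types is small; line is likely to matter more for a longer series.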
