  • Y-axis break in xtlineplot (or twoway plots in general)

    I have a dataset generated with the statsby command. The dataset has then been declared as panel time-series data with tsset, and I have drawn the following graph:


    Code:
    tsset panelvar timevar
    
    xtline var1, overlay ///
    legend(off) ///
    ytitle("(%)") xtitle("") ///
    xlab(minmax) xlabel(1950(10)2005) ///
    ysc(r(0/10, 90/100))



    I would like to omit the y-axis gap between 10% and 90%, in a similar fashion to the "axis ranges" command in SAS. (See for example:
    https://blogs.sas.com/content/graphi...earance-macro/
    http://support.sas.com/kb/24/909.html (check the results tab))

    Now, what I have tried here with...
    Code:
    ysc(r(0/10, 90/100))
    ...obviously does not work, but I hope you get the idea: I want a range from 0 to 10 and from 90 to 100, with a gap between 10 and 90.

    Is there any way to do this with xtline or twoway plots in general or should I transform the data in one way or another?

    All help is greatly appreciated.

    Br,
    Samuel


  • #2
    There is guidance on this at https://www.stata.com/support/faqs/g.../scale-breaks/
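
    One possible workaround, which is not necessarily what the FAQ recommends, is to draw the two bands as separate panels and stack them. The sketch below assumes that every observation falls either in 0-10 or in 90-100; the data are made up for illustration, and the graph names upper and lower are arbitrary.

    ```stata
    * Hypothetical data standing in for the statsby output: two panels, 1950-2005
    clear
    set obs 112
    gen panelvar = 1 + (_n > 56)
    bysort panelvar: gen timevar = 1949 + _n
    gen var1 = cond(panelvar == 1, 5 + 3*sin(timevar/5), 95 + 3*cos(timevar/5))
    tsset panelvar timevar

    * Upper panel: only the values in the 90-100 band
    xtline var1 if var1 >= 90, overlay legend(off) ytitle("(%)") xtitle("") ///
        ylabel(90(5)100) xlabel(1950(10)2005) name(upper, replace)

    * Lower panel: only the values in the 0-10 band
    xtline var1 if var1 <= 10, overlay legend(off) ytitle("(%)") xtitle("") ///
        ylabel(0(5)10) xlabel(1950(10)2005) name(lower, replace)

    * Stack the two panels in one column with a common x axis
    graph combine upper lower, cols(1) xcommon
    ```

    The if qualifiers matter here: setting yscale(range()) alone would not clip the data, because Stata extends an axis range to cover whatever is plotted.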



    • #3
      Thanks Nick, once again!



      • #4
        Hi All

        I recently found a way to "break" the y-axis of a bar chart so that it has a hard maximum even if your data includes a point beyond this maximum. I use a tall bar to indicate that the bar extends beyond the top of the figure (see attached).

        I found I could not do this as I wanted using the menu or commands, so I had to resort to the graph editor. What I did was first topcode the high value from 393, the real value, to 160 (arbitrarily chosen to be closer to the rest of my observations). I then created the bar chart, and in the graph editor I unlocked the movement of the high bar, and manually dragged it upwards to reach the very top of the axis. Again, I could not find a way to do this with yscale or associated options. But it worked with the graph editor. I then simply added a line and text box label to indicate that the bar is "off the charts". It is a slightly hacky process but the result is quite nice. See attached if interested.
        [Attached image: bar - by inst.png]
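
        A minimal sketch of the topcoding step described above, done in code rather than in the Graph Editor. The three counts are taken from the data posted later in this thread; the variable name topcoded and the wording of the note are assumptions for illustration. It does not reproduce the manual dragging of the bar.

        ```stata
        * Three of the counts reported later in this thread, for illustration
        clear
        input str40 Institution int Institution_count
        "University of Cape Town"          393
        "University of the Witwatersrand"   79
        "Stellenbosch University"           72
        end

        * Topcode at 160; the real value stays untouched in Institution_count
        gen topcoded = min(Institution_count, 160)

        * Bar chart of the topcoded values, tallest bar first,
        * with a note that flags the truncation
        graph bar (asis) topcoded, over(Institution, sort(1) descending label(angle(45))) ///
            ytitle("Downloads") ///
            note("University of Cape Town truncated at 160; actual value 393")
        ```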



        • #5
          I wouldn't truncate the scale here. What do you gain? The numbers are all readable. If you change the scale for the very highest value, you need to explain that, and at least some readers will be confused or indignant at your truncating the data. There is no win-win solution here, as the same would apply to use of a log scale, with perhaps different people being confused or indignant. (If you did use a log scale, I would recommend a dot chart, not a bar chart.)

          I would change the axes. The institution names would be much easier to read if presented horizontally.

          The FAQ cited in #2 is community-contributed but StataCorp wouldn't host it if they disagreed strongly. I can remember drawing in scale breaks manually in the 1960s but they are not supported in any software I know. How far that is for good reasons is debatable.

          If you posted the data others could experiment and show what they would do.



          • #6
            Hi Nick

            I read the FAQ you mention with great interest and found it useful, but unfortunately I couldn't manage to apply the same logic to the bar graph case. I have used dataex for the first time to generate the data for you. I will paste it below.

            To answer your question: the reason I did this (what is gained) is that without a change of scale the largest bar (University of Cape Town, UCT) is so tall that the others are all dwarfed. With truncation the scale is automatically made to suit the other data points, making them easy to compare. Another factor to consider is that in this case UCT is somewhat of an exception to the rest, as we host the NIDS-CRAM dataset, so naturally via word of mouth it gets a lot more attention from UCT than from the other institutions. So there is motivation to separate it out from the rest. Just to be clear in case there is a misunderstanding: in the image I attached before, the scale had already been truncated.

            The concerns you raised about the log scale were exactly why I avoided it - the audience in question wouldn't understand log transformations. This is also a motivation for using the humble yet intuitive bar chart.

            Your point about switching the axes is well taken - I didn't think of that and I am curious to try it.

            Regarding the FAQ: I posted on this thread just so my solution exists somewhere for whoever might be looking - I don't mind if it is added to the FAQ or not (although I would consider that a feather in my cap!).

            Data without truncation pasted below:
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str78 Institution int Institution_count
            "University of Cape Town"                393
            "University of the Witwatersrand"         79
            "Stellenbosch University"                 72
            "University of Johannesburg"              34
            "University of KwaZulu Natal"             31
            "Human Sciences Research Council"         20
            "University of Pretoria"                  13
            "International Monetary Fund"             12
            "University of the Free State"            12
            "Nelson Mandela University Metropolitan"  10
            "University of the Western Cape"           9
            "North-West University"                    8
            "University of Mannheim"                   7
            "University of Fort Hare"                  7
            "South African Reserve Bank"               7
            "Statistics South Africa"                  6
            "UNU-WIDER"                                6
            "University of Oxford"                     6
            "University of Namibia"                    5
            "Red Cross"                                5
            "Xiamen University"                        5
            "Consultant"                               5
            "Helen Suzman Foundation"                  5
            "FTI Consulting"                           5
            "University of Maryland"                   5
            "Harambee"                                 5
            "University of Southern California"        4
            "University of Washington"                 4
            "University of Ibadan"                     4
            "University of Birmingham"                 4
            "University of Nigeria"                    4
            "City of Cape Town"                        4
            "Loyola University"                        4
            "University of South Africa"               4
            end
            Regards,
            Bruce



            • #7
              Thanks for the data example. A principle of long standing is that graph bar is very flexible, but if and when it won't do all that you want, it's a case of reculer pour mieux sauter: use twoway bar instead. You then need to reinvent some of what the first command does for you, but that's a matter of standard options.

              Code:
               
              
              
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input str78 Institution int Institution_count
              "University of Cape Town"                393
              "University of the Witwatersrand"         79
              "Stellenbosch University"                 72
              "University of Johannesburg"              34
              "University of KwaZulu Natal"             31
              "Human Sciences Research Council"         20
              "University of Pretoria"                  13
              "International Monetary Fund"             12
              "University of the Free State"            12
              "Nelson Mandela University Metropolitan"  10
              "University of the Western Cape"           9
              "North-West University"                    8
              "University of Mannheim"                   7
              "University of Fort Hare"                  7
              "South African Reserve Bank"               7
              "Statistics South Africa"                  6
              "UNU-WIDER"                                6
              "University of Oxford"                     6
              "University of Namibia"                    5
              "Red Cross"                                5
              "Xiamen University"                        5
              "Consultant"                               5
              "Helen Suzman Foundation"                  5
              "FTI Consulting"                           5
              "University of Maryland"                   5
              "Harambee"                                 5
              "University of Southern California"        4
              "University of Washington"                 4
              "University of Ibadan"                     4
              "University of Birmingham"                 4
              "University of Nigeria"                    4
              "City of Cape Town"                        4
              "Loyola University"                        4
              "University of South Africa"               4
              end
              
              gen Inst = subinstr(Institution, "University", "U", .) 
              
              gen order = _n
              
              * package gr0034 from http://www.stata-journal.com/software/sj8-2
              labmask order, values(Inst)
              
              gen toshow = min(Institution_count, 160)
              gen zero = 0 
              
              twoway bar toshow order, horizontal yla(1/34, ang(h) valuelabel noticks) base(0) barw(0.8) ///
              || scatter order zero, ms(none) mla(Institution_count) mlabpos(9) legend(off) xsc(r(-30, .)) ytitle("") ysize(7) ysc(reverse) ///
              || scatteri 1 150 "+", ms(none) xtitle(Downloads) xsc(alt)
              [Attached image: bruce.png]


              It's perhaps redundant for anyone studying the graph and code carefully, but I will spell out various small devices.

              1. For readability of long explanatory text you and your readers will find interesting, go horizontal. Elsewhere I introduced the term giraffe graphics as derogatory for anything that obliges readers to turn their heads to read something. Hint: Often they won't bother.

              2. I detest cryptic abbreviations but I am guessing that everyone will find U clear enough for University. If everyone knows (e.g.) HSRC as standard in South Africa then you can gain more space.

              3. You need to create the y axis variable as the observation number and then define its value labels to use as axis labels. labmask was written to make that easier. The good folks at UCLA wrote an FAQ about this at https://stats.idre.ucla.edu/stata/fa...ther-variable/ which is fine by me, although I prefer to cite https://www.stata-journal.com/articl...article=gr0034 which may add useful context and extra tips in the same territory.

              4. To truncate the highest bar there is absolutely no need to fool around in the Graph Editor. I would feel a strong need, as you did, to flag on the graph whenever a bar was truncated. One device is shown in the code and the graph; another would be to use an arrow; regardless of the choice I would add text on a presentation slide or text in a graph caption in a paper, report, or thesis.

              5. I think the table-like flavour is improved by putting numbers on the left of (at the base of) each bar.

              6. Also, it's a matter of taste only, but for table-like graphs the x-axis stuff can go better at the top. More at https://www.stata-journal.com/articl...article=gr0053

              7. You'll want to add an overall title. In #4 beware the typo "Dowloads" if you prefer your original after all this.
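
              Two of the devices above can be sketched in a slightly different form. This is only an illustration on the first four rows of the data: the loop is one assumed way to build the value labels of point 3 without installing labmask, and pcarrowi draws the arrow mentioned in point 4 in place of the "+" marker. The arrow coordinates, its white colour, and the note wording are arbitrary choices.

              ```stata
              clear
              input str40 Institution int Institution_count
              "University of Cape Town"          393
              "University of the Witwatersrand"   79
              "Stellenbosch University"           72
              "University of Johannesburg"        34
              end

              gen order = _n

              * Point 3 without labmask: copy each name into a value label for order
              forvalues i = 1/`=_N' {
                  label define order `i' "`=Institution[`i']'", modify
              }
              label values order order

              gen toshow = min(Institution_count, 160)

              * Point 4 with an arrow: pcarrowi takes y1 x1 y2 x2, so this arrow
              * runs along the first bar from x = 135 to x = 158
              twoway bar toshow order, horizontal base(0) barw(0.8) ///
                  yla(1/4, ang(h) valuelabel noticks) ysc(reverse) ytitle("") ///
                  || pcarrowi 1 135 1 158, lcolor(white) mcolor(white) ///
                  legend(off) xtitle(Downloads) ///
                  note("University of Cape Town bar truncated at 160; actual value 393")
              ```

              Whichever flag is used, the note or a caption should still say in words that the bar is truncated.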
