  • Graph bar over year: how to shorten displayed year labels?

    I want to create a bar plot using
    Code:
    graph bar y1 y2, over(year)
    The four-digit year labels overlap with each other in the plot. Is there a way to convert them to two-digit labels (e.g., 2015 to "15")? Usually I use
    Code:
    xlabel(2000 "00" 2001 "01" 2002 "02" 2003 "03" 2004 "04" 2005 "05" ///
    2006 "06" 2007 "07" 2008 "08" 2009 "09" 2010 "10" 2011 "11" 2012 "12" 2013 "13" ///
    2014 "14" 2015 "15" 2016 "16" 2017 "17" 2018 "18" 2019 "19")
    but it does not work in this case.


    [Image: year.PNG]

  • #2
    I would move to twoway bar with that many bars. Then you don't need a bar label for every bar.

    Here is a silly example:

    Code:
    webuse grunfeld , clear
    gen yearL = year - 0.1
    gen yearR = year + 0.1
    twoway bar invest yearL if company == 1, barw(0.2) color(red) ///
        || bar kstock yearR if company == 1, barw(0.2) color(blue) ///
        xla(1935(2)1955) legend(pos(11) col(1) ring(0))
    Note that with graph bar there is no x axis. If you wanted to stay with graph bar you would need to do something more like this.

    Code:
    local call
    forval j = 1/20 {
          local show = `j' + 34
          local call `call' `j' "`show'"
    }
    
    graph bar (asis) invest kstock if company == 1, over(year, relabel(`call')) bar(1, color(red)) bar(2, color(blue))
    In this case the loop sets up a mapping so that bars 1 to 20 show 35 (for 1935) to 54 (for 1954).

    Your loop would go more like

    Code:
    local call
    forval j = 1/20 {
          local show : di %02.0f  (1999  + `j' - 2000) 
          local call `call' `j' "`show'"
    }
    Code:
    graph bar y1 y2, over(year, relabel(`call'))
    See also https://www.stata-journal.com/sjpdf....iclenum=pr0051



    • #3
      The calculation 1999 - 2000 can be simplified!
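
A minimal sketch of that simplification, assuming as in #2 that bars 1 to 20 are the years 2000 to 2019: 1999 + `j' - 2000 is just `j' - 1. The final display line is only there to inspect what the loop built.

```stata
* Build relabel() text "00" to "19" for bars 1 to 20
local call
forval j = 1/20 {
      * bar j is year 1999 + j, so its two-digit label is j - 1
      local show : di %02.0f (`j' - 1)
      local call `call' `j' "`show'"
}

* Inspect the mapping before passing it to relabel()
display `"`call'"'
```

The macro can then be used exactly as in #2, inside over(year, relabel(`call')).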



      • #4
        Thank you! This works.



        • #5
          Assuming you do want to label every bar, as a prompt that the data are annual, and if the year values are values, it's easy to define a label for each value and adjust the size if necessary.
          I use a style in which the start and end years get the full id but intermediate years are abbreviated. You could also use relabel for a one-off graph. There is also the alt option, which might put odd and even years on separate rows of labels.
          label define year 2000 "2000" 2001 "01" 2002 "02" // etc., one pair for each remaining year
          label values year year
          graph bar y1 y2, over(year, label(labsize(*.8))) // may be all you need.
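
A hedged sketch of this style, with the value labels written by a loop rather than typed out. The dataset here is made up purely for illustration (y1, y2 and year for 2000 to 2019 are stand-ins for the original poster's variables), and labsize() is placed inside label(), which is where I understand the help for graph bar to document it.

```stata
* Hypothetical data standing in for the poster's y1, y2 and year
clear
set obs 20
gen year = 1999 + _n
gen y1 = runiform()
gen y2 = runiform()

* First and last years in full, intermediate years as two digits
local call
forval y = 2000/2019 {
      local show : di %02.0f (`y' - 2000)
      if inlist(`y', 2000, 2019) local show `y'
      local call `call' `y' "`show'"
}

label define year `call', replace
label values year year

graph bar (asis) y1 y2, over(year, label(labsize(*.8)))
```

Because graph bar uses value labels on the over() variable when they exist, no relabel() is needed here.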



          • #6
            Let's generalise this mildly.

            The general problem I take to be this: I am using graph bar but my categorical axis labels are a mess. They overlap and I need shorter labels. Or (sometimes) I don't need all the labels.

            The example in this thread of a bar chart for time series raises just about all the generic issues that arise and also specific issues for time series that arise commonly.

            Also, bars side by side for two or more variables are often wanted, and one of the attractions of graph bar, so we will stick with that.

            Let's just change the dataset to example data that everyone can access. To get side issues out of the way, let's say that I have decided in advance that I want red and blue bars and I have decided from earlier versions of this where I want the legend to go. Also I set my own default scheme.

            Code:
            webuse grunfeld, clear
            
            set scheme s1color
            local opts1 bar(1, color(red)) bar(2, color(blue))
            local opts2 legend(pos(11) ring(0) col(1))

            Code:
            graph bar (asis) invest kstock if company == 1, over(year) `opts1' `opts2' name(G0, replace) subtitle(0, size(medlarge) placement(w))
            Graph 0 is bad.

            Thus the problem is immediate. There are only 20 distinct years here -- many time series have many more -- but that is enough to cause a mess.

            The explanation is that graph bar doesn't know or care that your data are time series. It just regards the over() variable as categorical. The way that categorical variable sorts -- so that year values are ordered from 1935 to 1954 -- is exactly what you want, but the default that every category is matched by an explicit text label isn't always what you want for time series.

            [Image: barlabels_G0.png]



            Possible solution: Use graph hbar instead.

            Very often, the answer is just: go horizontal. Many of the (vertical) bar charts I see (some people insist on calling such charts column charts) would be better off horizontal, giving space for longer text labels to be readable, and avoiding horrible solutions such as vertical labels, labels on a slant, over-abbreviated labels, or labels in a tiny font size.

            When this is the answer, good, and you bail out here.

            For time series, this is not an acceptable answer. There is a strong convention across many fields that time belongs on the horizontal axis. Put your time variable on a vertical axis and your boss or someone reviewing or grading your work is going to squawk.

            Possible solution: Use format instead.

            A user knowing a little about Stata's way with date variables may think of just using an abbreviated display format. This shows how to display the last two digits of the year:

            Code:
            . display %tyYY 1935
            35


            However, my experiments with assigning such a format to the year variable failed. I think this should work, but it doesn't at present.

            Possible solution: Use (shorter) value labels instead.

            This was suggested by Allan Reese in #5. As he points out, you are in charge and aren't obliged to use the same recipe throughout. Typing out definitions for 20 value labels is less fun than writing a loop to write code for your later label command.

            Code:
            local call
            forval j = 1/20 {
                  local show = `j' + 34
                  local year = 19`show'
                  local call `call' `year' "`show'"
            }
            
            label def year `call'
            label val year year
            
            label list year
            
            graph bar (asis) invest kstock if company == 1, over(year) `opts1' `opts2' name(G1, replace) subtitle(1, size(medlarge) placement(w))
            The initial statement
            Code:
            local call
            is worth flagging. I blank out any contents of the local macro call that exist (equivalently, I erase, drop or delete the macro). That way, I don't get bitten because of whatever is left behind from my previous attempt at this.

            Graph 1 is better and you might want to stop here.


            What we did was set up value labels such as "35" for 1935 and then graph bar didn't need to be told to look for and use value labels when they exist. Its expectation, as said, is that the over() variable is categorical (in the statistical sense), so the user very likely has value labels defined.

            So far so good, but even for this example you might think "too many labels!". Or your real data might be a time series with many more values, so you really do think "too many labels!".

            Note that although you can set some of your value labels to spaces or even exotic characters such as char(160) or uchar(160), such value labels are not honoured by graph bar. So we need some other device.

            Possible solution: Use relabel().

            The help tells you about this suboption. You can just spell out what text you want. That does allow blanking out in a strong enough sense. Here bars are numbered 1 to 20. I write code for the suboption relabel() such that odd-numbered bars show a two-digit year and even-numbered bars show a single space (which you shouldn't notice, as it looks like no text). Detail: an empty string won't work: the command won't believe that you don't want a label at all.

            Code:
            local call
            forval j = 1/20 {
                  local show = `j' + 34
                  if mod(`j', 2) local call `call' `j' "`show'"
                  else local call `call' `j' " "
            }
            
            graph bar (asis) invest kstock if company == 1, over(year, relabel(`call')) `opts1' `opts2'  name(G2, replace) subtitle(2, size(medlarge) placement(w))
            [Image: barlabels_G2.png]


            In fact we have enough space to show 1935, 1937 and so on to 1953 in full. Let's clear the value labels out of the way and just show spaces instead of the even-numbered years.


            Code:
            label val year
            local call
            forval j = 2(2)20 {
                  local call `call' `j' " "
                  
            }
            
            
            graph bar (asis) invest kstock if company == 1, over(year, relabel(`call')) `opts1' `opts2' name(G3, replace) subtitle(3, size(medlarge) placement(w))
            [Image: barlabels_G3.png]
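
For a longer series, the same blanking idea can be sketched without hard-coding the number of bars. This is only an illustration under my own assumptions: it uses levelsof to list the years actually plotted, and it keeps labels only for years divisible by 5, which is an arbitrary choice.

```stata
webuse grunfeld, clear

* Distinct years that will appear as bars, in sorted order
levelsof year if company == 1, local(years)

* Blank every bar whose year is not a multiple of 5
local call
local j = 0
foreach y of local years {
      local ++j
      if mod(`y', 5) local call `call' `j' " "
}

graph bar (asis) invest kstock if company == 1, over(year, relabel(`call'))
```

Bars left out of relabel() keep their default text, so here only 1935, 1940, 1945 and 1950 should remain labelled.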



            If you are still thinking "this is more messing around for a very simple problem than I want; isn't there a simpler solution?", then I sympathise.

            Possible solution: Use twoway bar instead.

            For bars side-by-side, you need to work a little at defining offset variables, but now your horizontal axis labels really are controllable simply and directly by xlabel().

            Code:
            gen yearL = year - 0.15
            gen yearR = year + 0.15
            
            twoway bar invest yearL, barw(0.3) color(red) || bar kstock yearR, barw(0.3) color(blue) xla(1935(5)1955) name(G4, replace) `opts2' subtitle(4, size(medlarge) placement(w))
            I plot one series of bars against a year variable offset left and the other series against a year variable offset right, so twoway doesn't know what I want as x axis title. That is fine by me, because I don't need a dopey axis title like "year" at all. (If I want to explain that series are plotted against year, I can do that in the text I write in my text editor or word processor.)
            [Image: barlabels_G4.png]






            Last edited by Nick Cox; 26 Jul 2020, 05:53.



            • #7
              In case 4 of the previous post the restriction if company == 1 is needed just as in all previous cases.
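
For concreteness, a sketch of case 4 with that restriction added to both bar plots. The legend options are written out rather than taken from the local macro opts2, so that the block runs on its own from a do-file (the /// continuation lines need a do-file or the do-file editor).

```stata
webuse grunfeld, clear

* Offset copies of year so the two series sit side by side
gen yearL = year - 0.15
gen yearR = year + 0.15

twoway bar invest yearL if company == 1, barw(0.3) color(red) ///
    || bar kstock yearR if company == 1, barw(0.3) color(blue) ///
    xla(1935(5)1955) legend(pos(11) ring(0) col(1)) name(G4, replace)
```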

              In the case of the Grunfeld data there is another easy work-around which is to work with

              Code:
              gen year2 = year - 1900
              and then the time variable runs from 35 to 54.

              Trouble is, this won't work nicely for years that end in 00 to 09.

              Suppose the data are for 2000 to 2019 and you subtract 2000. Then most readers would want to see 00 to 19 if two digits only were used, and as above graph bar won't honour the display format needed to get that to work.
              Last edited by Nick Cox; 27 Jul 2020, 11:44.

