  • Human readable dates on graphs

    All-
    I am creating a box plot with a %d-formatted date ([edate]) as my x-axis category. While [edate] displays as a regular date (e.g. 01Oct2014) in the data file, in the graph the variable appears as a 5-digit number that doesn't make immediate sense. I have tried to reformat the variable as a date in the 'edit graph' options, but that does not seem to work. Any advice would be greatly appreciated. Best-Patrick

  • #2
    I've seen and solved this problem before but can't find the posts. In essence, graph box doesn't provide a handle (that is obvious to me) to honour display formats. It will respect value labels. Hence, study this for technique:

    Code:
    clear 
    input date y 
    21000   1
    21000   2
    21000   3  
    21001   4  
    21001   5
    21001   6 
    end
    format date %td
    graph box y, over(date) 
    
    gen sdate = string(date, "%tdD_M_CY") 
    * install labmask from Stata Journal site after -search labmask- 
    labmask date, values(sdate) 
    
    graph box y, over(date)
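
    For readers who would rather not install anything, roughly the same effect can be sketched with built-in commands only. This is not a description of how labmask is implemented; the label name datelbl and the format %tdDD_Mon_CCYY are arbitrary choices made for illustration.

    ```stata
    clear
    input date y
    21000   1
    21000   2
    21000   3
    21001   4
    21001   5
    21001   6
    end
    format date %td

    * collect the distinct dates, then define one value label per date
    levelsof date, local(days)
    foreach d of local days {
        * render the numeric date as text, e.g. 21001 -> 01 Jul 2017
        local txt : display %tdDD_Mon_CCYY `d'
        label define datelbl `d' "`txt'", modify
    }
    label values date datelbl

    * graph box shows value labels for the over() categories
    graph box y, over(date)
    ```

    The loop only labels dates that actually occur in the data, which is all that over() needs.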

    • #3
      Wow! This is an interesting solution that seems to work! Many thanks-Patrick

      • #4
        Here's a followup question. Since the x axis is a date, you might not want to treat it as a category but as numeric. A categorical x axis will be equally spaced, a numerical x axis might not be.

        E.g., if I took measurements on July 1 & 2, and then again on July 7 & 8, I would want more space between the measurements that are taken a week apart than between the measurements that were taken a day apart. 7 times as much space, to be exact.

        By default, graph box will space the four days equally. See my modification of Nick's code, below. Is there a way to get the results that I'm looking for?

        (Bonus points if you can do it with Winters' and Nichols' SSC command vioplot instead of graph box.)

        Code:
        clear
        input date y
        21001 1
        21001 2
        21001 3
        21002 1
        21002 2
        21002 4
        21008 4
        21008 5
        21008 6
        21009 3
        21009 5
        21009 6
        end
        format date %td
        graph box y, over(date)

        gen sdate = string(date, "%tdD_M_CY")
        ssc install labutil
        labmask date, values(sdate)

        graph box y, over(date)

        • #5
          You can subtract as many points as you like because I dislike violin plots. (No disrespect to the authors concerned who are personally well known!)

          stripplot (SSC) is the complete opposite to graph box in this respect as it takes the axis position literally.

          Code:
          stripplot y, over(date) vertical box
          would be the start of a command.
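
          If installing stripplot is not an option, one rough alternative is to compute the summaries yourself and draw them with twoway, which plots dates at their numeric positions and accepts a date format() inside xlabel(). This is only a sketch using the data from #4: the variable names med, q1, q3, lo and hi are made up for illustration, and the whiskers here simply run to the minimum and maximum rather than following the rule graph box uses.

          ```stata
          clear
          input date y
          21001 1
          21001 2
          21001 3
          21002 1
          21002 2
          21002 4
          21008 4
          21008 5
          21008 6
          21009 3
          21009 5
          21009 6
          end
          format date %td

          * per-date summaries: median, quartiles, extremes
          egen med = median(y), by(date)
          egen q1  = pctile(y), p(25) by(date)
          egen q3  = pctile(y), p(75) by(date)
          egen lo  = min(y), by(date)
          egen hi  = max(y), by(date)

          * spikes for the range, bars for the quartile box, markers for the median;
          * the x axis is numeric, so days a week apart are drawn a week apart
          twoway rspike lo hi date || rbar q1 q3 date, barwidth(0.6) ///
              || scatter med date, ms(Dh) ///
              xlabel(21001 21002 21008 21009, format(%tdDD_Mon)) ///
              legend(off) ytitle("y") xtitle("")
          ```

          With only three observations per date the quartiles coincide with the extremes, so the boxes cover the whole range; with real data they would not.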

          • #6
            Thank you. I'm having trouble getting stripplot to display dates as dates. For example, in my data, readtp is a test score, and test_date is a test date, formatted as follows:

            format test_date %d

            Now some graphing commands, like scatter, will display test_date as a date, and even let me use dates to position and label the xticks so they line up with the beginning and end of each school year:

            scatter readtp test_date, tlabel(25aug2010 "Kindergarten" 31may2011 " " 25aug2011 "1st grade" 31may2012 " ", grid)

            By contrast, stripplot labels the xticks with numbers, not dates:

            stripplot readtp, over(test_date) vertical box

            Following an earlier hint, I can use labmask to make stripplot display test_date as a date:

            gen sdate = string(test_date, "%tdD_M_CY")
            * install labmask from Stata Journal site after -search labmask-
            labmask test_date, values(sdate)
            stripplot readtp, over(test_date) vertical box

            But I still can't customize the horizontal axis as I could with scatter. stripplot doesn't allow the tlabel option, and if I use the xlabel option it goes back to labeling the horizontal axis with numbers instead of dates.

            How can I get stripplot to display test_date as a date and let me customize the position and labeling of ticks along the horizontal axis?






            • #7
              stripplot (SSC, as you are asked to explain) is not especially literal in devising a categorical axis. Regardless of variable supplied, stripplot maps categories to new variables with values 1 up.

              I think you're going to need to customise by specifying text as well as values.

              Code:
              sysuse auto 
              stripplot mpg, over(rep78) vertical xla(1 "Some" 2 "talk" 3 "of" 4 "Alexander" 5 "!!!")

              • #8
                Dear All,

                I have a follow up question on this thread after some time:
                I have exactly the same problem as described in #1, except that I have the date on the y-axis. Following the tips in this thread, I have coded:

                Code:
                gen edate = mdy(month, day, year)
                format edate %d
                gen sdate = string(edate, "%tdD_M_CY")
                labmask edate, values(sdate)
                    
                graph box edate, over(round) ///
                title("Date of survey by round", size(medium)) ytitle("Date of survey") graphregion(margin(4 4 8 8) color(white)) ///
                ylabel(, valuelabel nogrid)
                However, the y-axis still displays the numeric values of edate instead of a readable date format.
                Here is an example of my data structure:



                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input byte(round day month) int year float edate str17 sdate
                1 13 6 2020 22079 "13 June 2020"
                1 15 6 2020 22081 "15 June 2020"
                1 12 6 2020 22078 "12 June 2020"
                1 19 6 2020 22085 "19 June 2020"
                1 26 6 2020 22092 "26 June 2020"
                1 12 6 2020 22078 "12 June 2020"
                1 10 6 2020 22076 "10 June 2020"
                1 23 6 2020 22089 "23 June 2020"
                1 24 6 2020 22090 "24 June 2020"
                1 12 6 2020 22078 "12 June 2020"
                end

                Can someone tell me how to solve this issue, by any chance?

                Many thanks in advance!

                • #9
                  labmask is from the Stata Journal, as you are asked to explain.

                  This may arise because value labels are not defined for all the dates that graph box wants to show on the axis, although elsewhere that is not problematic.

                  Either way, a work-around is just to use a display format.

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input byte(round day month) int year float edate str17 sdate
                  1 13 6 2020 22079 "13 June 2020"
                  1 15 6 2020 22081 "15 June 2020"
                  1 12 6 2020 22078 "12 June 2020"
                  1 19 6 2020 22085 "19 June 2020"
                  1 26 6 2020 22092 "26 June 2020"
                  1 12 6 2020 22078 "12 June 2020"
                  1 10 6 2020 22076 "10 June 2020"
                  1 23 6 2020 22089 "23 June 2020"
                  1 24 6 2020 22090 "24 June 2020"
                  1 12 6 2020 22078 "12 June 2020"
                  end 
                      
                  graph box edate, over(round) ///
                  title("Date of survey by round", size(medium)) ytitle("Date of survey") graphregion(margin(4 4 8 8) color(white)) ///
                  ylabel(, format(%tddd_Month_CCYY) ang(h) nogrid)
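
                  The same format() suboption can be combined with explicit tick positions if the default ticks fall on awkward dates. A minimal sketch on a cut-down copy of the example data; the weekly spacing from 10 June to 24 June 2020 and the local macro names first and last are arbitrary choices for illustration:

                  ```stata
                  clear
                  input byte round float edate
                  1 22079
                  1 22081
                  1 22078
                  1 22085
                  1 22092
                  1 22076
                  1 22089
                  1 22090
                  end

                  * td() converts a date literal to Stata's daily date number
                  local first = td(10jun2020)
                  local last  = td(24jun2020)

                  * ticks every 7 days, shown through a daily display format
                  graph box edate, over(round) ///
                      ylabel(`first'(7)`last', format(%tddd_Month_CCYY) ang(h) nogrid) ///
                      ytitle("Date of survey")
                  ```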

                  • #10
                    Dear Mr Cox,

                    thanks a lot for your help, setting the format worked.

                    Kind regards

                    • #11
                      Originally posted by Nick Cox
                      I've seen and solved this problem before but can't find the posts. In essence, graph box doesn't provide a handle (that is obvious to me) to honour display formats. It will respect value labels. Hence, study this for technique:

                      Code:
                      clear
                      input date y
                      21000 1
                      21000 2
                      21000 3
                      21001 4
                      21001 5
                      21001 6
                      end
                      format date %td
                      graph box y, over(date)
                      
                      gen sdate = string(date, "%tdD_M_CY")
                      * install labmask from Stata Journal site after -search labmask-
                      labmask date, values(sdate)
                      
                      graph box y, over(date)
                      Dear Dr. Cox,

                      Just a note to thank you. I was going nuts in trying to display date in a Box plot. Searched the forum and you have the solution. All the best.
