
  • Adding xline to histogram plots plotted over a categorical variable

    I am trying to plot the expected distribution of crop yield as histograms, one for each type of farmer. The histograms appear as panels of a single graph.
    How can I add an xline to each panel? Preferably, each xline would mark the mean yield for that category (type of farmer).
    Below is the code for my histograms without xlines.

    Code:
    hist exp_yield if crop==1, ///
        by(I6, total style(compact) xrescale yrescale note("") ///
           subtitle(, ring(0) pos(1) nobexpand nobox size(*.7))) ///
        color(eltgreen) barwidth(3) ///
        xtitle("Expected distribution of yield: in 100 kgs", size(*0.7)) ///
        ytitle("") ylab(, nogrid) scheme(s1mono)

  • #2
    Without a graph or a data example from you, some guessing is needed. Getting the same xline on every panel is easy, while getting a different xline on each seems much harder, which is presumably why you are asking.
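
    For reference, the easy case can be sketched as follows. This is only an illustration on the auto data, not the poster's data; the local macro name overall is my own choice. The same overall mean is drawn in every panel because xline() is applied identically to each by() group.

```stata
* Sketch: one common reference line (the overall mean) in every panel
sysuse auto, clear

* summarize leaves the overall mean of mpg in r(mean)
summarize mpg, meanonly
local overall = r(mean)

* xline() draws the same vertical line in each by() panel
histogram mpg, by(foreign) start(10) width(2) xline(`overall')
```

    This does not help when each panel needs its own line, which is the harder case in the question.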

    A miniature review of added lines is in press at the Stata Journal, but a conversation with the Editors leads me to guess that it may not appear until 24(3) or 24(4).

    kg is the standard scientific abbreviation for kilogram, whether singular or plural.

    Here is one line of thought.

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    . egen mean = mean(mpg), by(foreign)
    
    . histogram mpg, by(foreign) start(10) width(2)
    
    . gen where = 0.15
    
    . histogram mpg, by(foreign) start(10) width(2) addplot(spike where mean) legend(order(2 "mean"))

    The idea is that looking at a draft graph lets you see that drawing lines up to 0.15 would work quite well.

    [Figure: histo_line2.png]

    There are several ways to improve that display, but I am just addressing the question.

    However, it's hard to see how this could be compatible (1) with yrescale (indeed hard to see why that helps; isn't the point of the display to compare farmer types?) or (2) with a total suboption to by().

    For (2) other trickery is possible: see https://www.stata-journal.com/articl...article=gr0058

    My advice on (1) is just don't do that.
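
    Putting that advice together for the original command, a sketch might look like the following. The variable names exp_yield, I6 and crop come from post #1, but the data here are simulated stand-ins invented purely so that the code runs; the names mean_yield and where, the spike height of 0.1 and the bin width of 2 are likewise assumptions to be replaced after looking at a draft graph of the real data.

```stata
* Sketch only: simulated stand-in data, NOT the poster's dataset
clear
set seed 2024
set obs 300
generate I6 = 1 + mod(_n, 3)                  // three made-up farmer types
generate crop = 1
generate exp_yield = rnormal(20 + 5 * I6, 4)  // made-up yields

* Mean yield within each farmer type, for the crop being plotted
egen mean_yield = mean(exp_yield) if crop == 1, by(I6)

* Height of the spikes on the density scale: a guess, to be tuned by eye
generate where = 0.1

* total and yrescale are dropped; spike draws from 0 up to where at each mean
histogram exp_yield if crop == 1, width(2) ///
    by(I6, style(compact) xrescale note("")) ///
    addplot(spike where mean_yield) legend(order(2 "mean")) ///
    scheme(s1mono)
```

    With the real data, only the lines from egen onwards would be needed.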

    There are other approaches to histograms in any case. Compare this application of stripplot from SSC.

    Code:
     stripplot mpg, over(foreign) stack refline(lw(medthick)) vertical ms(Sh) height(0.3) aspect(1)

    [Figure: histo_line3.png]

    Indeed, histograms have many disadvantages, being dependent on choices of bin width and origin and sometimes obscuring interesting or important detail.
    Side-by-side quantile plots can work well.

    Code:
    stripplot mpg, over(foreign) refline(lw(medthick)) vertical cumul cumprob ms(Sh) aspect(1) centre xla(, noticks)

    [Figure: histo_line4.png]
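
    If installing stripplot is not an option, a rough equivalent of side-by-side quantile plots can be sketched with official commands only. This is a hand-rolled approximation, not what stripplot does internally; the plotting position (i - 0.5)/n is one common choice among several, and the names mean_mpg and p are mine. Because the mean is held in a variable, each panel gets its own horizontal reference line.

```stata
* Sketch: quantile plots by group with group-specific mean lines
sysuse auto, clear

* Group means, constant within each category of foreign
egen mean_mpg = mean(mpg), by(foreign)

* Plotting positions (i - 0.5)/n within each group, after sorting on mpg
sort foreign mpg
by foreign: generate p = (_n - 0.5) / _N

* Ordered values against plotting positions, with the group mean overlaid
twoway (scatter mpg p, ms(Oh)) (line mean_mpg p, sort), ///
    by(foreign, note("")) legend(order(2 "mean")) ///
    xtitle("Fraction of the data") ytitle("Mileage (mpg)")
```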



    • #3
      Thank you, Nick! Dropping "total" and "yrescale" is the solution until extended xline and yline features are added to Stata. I will also definitely explore stripplot. Finally, many thanks for pointing out my mistake with kg.



      • #4
        Glad it helped.

        But I doubt that Stata will ever support xline() in the way that you want. See also FAQ #18.
