  • Slant lines in histogram

    I have run a small experiment with 3 treatments (i.e. 4 groups total including control), and am trying to overlay demographic charts to show the groups are largely comparable.

    The following command, which overlays histograms of property income (categorical) for the 4 groups, produces the chart below.

    Code:
    twoway (hist income2 if control == 1, fcolor(blue%0) lcolor(blue) frequency xlabel(#6, valuelabel labsize(vsmall))  width(1) start(1) discrete xtitle("")) ///
    (hist income2 if treat1 == 1, fcolor(red%0) lcolor(red) frequency xlabel(#6, valuelabel labsize(vsmall))  width(1) start(1) discrete xtitle("")) ///
    (hist income2 if treat2 == 1, fcolor(green%0) lcolor(green) lpattern(dash) frequency xlabel(#6, valuelabel labsize(vsmall))  width(1) start(1) discrete xtitle(""))  ///
    (hist income2 if treat3 == 1, fcolor(black%0) lcolor(black) lpattern(dash) frequency xlabel(#6, valuelabel labsize(vsmall))  width(1) start(1) discrete xtitle("")) ///
    , legend(order(1 "Control" 2 "Treatment 1" 3 "Treatment 2" 4 "Treatment 3"))
    [Figure: Screenshot 2022-04-14 155933.png]

    What I don't understand is what the slanted lines "joining" certain histogram bars to others are, and how to get rid of them.

    I should mention that for certain categorical codings (12, 13, 15), there are no observations.

    P.S. This does not seem to be a problem unique to `twoway`. I get the same output if I run `hist`

    Code:
    hist income2 if treat3 == 1, fcolor(black%0) lcolor(black) lpattern(dash) frequency xlabel(#6, valuelabel labsize(vsmall))
    [Figure: mono.jpg]
    Last edited by Pratap Pundir; 14 Apr 2022, 09:08.
    Thank you for your help!

    Stata SE/17.0, Windows 10 Enterprise

  • #2
    With 4 groups overlaid, it is difficult to make out the frequencies with your design. Consider tabplot from the Stata Journal. You will need to do the binning yourself.

    Code:
    findit tabplot

    Otherwise, see https://www.statalist.org/forums/for...-diagonal-line which reports a similar issue.
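
    As a minimal sketch of the tabplot route, assuming the indicators control, treat1, treat2 and treat3 are coded 0/1 and are mutually exclusive. The variable name group and the label name grouplbl are hypothetical and do not appear in #1:

    ```stata
    * Install tabplot from SSC (replace refreshes an existing copy)
    ssc install tabplot, replace

    * Hypothetical single grouping variable built from the four indicators
    generate byte group = .
    replace group = 0 if control == 1
    replace group = 1 if treat1 == 1
    replace group = 2 if treat2 == 1
    replace group = 3 if treat3 == 1
    label define grouplbl 0 "Control" 1 "Treatment 1" 2 "Treatment 2" 3 "Treatment 3"
    label values group grouplbl

    * One bar per cell: income category (rows) by group (columns),
    * with bar heights shown as percentages within each group
    tabplot income2 group, percent(group) showval
    ```

    Percentages within group, rather than raw frequencies, should make the columns comparable even if the groups differ in size.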
    Last edited by Andrew Musau; 14 Apr 2022, 09:57.


    • #3
      I agree with Andrew Musau that -- although the immediate problem is evidently a bug with lp(dash) -- histograms aren't going to work very well here. You have binned income data but I'd still prefer some kind of quantile or cumulative distribution plot for such data.
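
      A cumulative distribution plot along these lines might look as follows. This is only a sketch: it assumes the four indicators from #1 are coded 0/1, and the cdf_* variable names are made up for illustration:

      ```stata
      * Empirical cumulative proportions of income2 within each group;
      * equal gives tied observations (same income bin) the same value
      cumul income2 if control == 1, generate(cdf_control) equal
      cumul income2 if treat1 == 1, generate(cdf_treat1) equal
      cumul income2 if treat2 == 1, generate(cdf_treat2) equal
      cumul income2 if treat3 == 1, generate(cdf_treat3) equal

      * Overlay the four step functions on one set of axes;
      * sort orders the points along the x axis before connecting them
      twoway (line cdf_control income2, sort connect(stairstep)) ///
             (line cdf_treat1 income2, sort connect(stairstep)) ///
             (line cdf_treat2 income2, sort connect(stairstep)) ///
             (line cdf_treat3 income2, sort connect(stairstep)), ///
             legend(order(1 "Control" 2 "Treatment 1" 3 "Treatment 2" 4 "Treatment 3")) ///
             ytitle("Cumulative proportion") xtitle("")
      ```

      If the groups are similarly distributed, the four step curves should lie close to one another across the income bins.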


      • #4
        Your data could be shown here as 4 x 16 frequencies.


        • #5
          Thank you, Andrew Musau and Nick Cox!

          Yes, the lines are an artefact of an lpattern(dash) bug.

          The data itself are categorical, unfortunately - I don't know actual incomes, just which bin a particular observation's income falls into.

          Is there another, better way of plotting the comparison to show that the four groups are similarly distributed on income?


          • #6
            I've already made suggestions in #3. See #4 for the suggestion that you show whatever data you're plotting in #1, i.e.


            Code:
            dataex income2 control treat? 


            • #7
              The request in #6 would be better framed as showing the results of

              Code:
              contract income2 treat? control 
              list
