  • Random gap in my histogram

    Hi,

    I use the following code to plot a histogram:

    Code:
        
    hist social, percent xlabel(0 1 2 3 4 5 6 7 8 9 10 "10<") scheme(s1mono) barwidth(1)

    The variable social takes integer values from 1 to 10, and I have at least one observation for each integer.

    But I get the following plot: Why is there a gap between 4 and 5?
    [Attached image: hist-social.png]

  • #2
    Try the discrete option and it should disappear.

    (My guess is that hist reported the bin width it was using, and that it wasn't exactly 1. Specifying barwidth() is cosmetic and not equivalent to setting the bin width. For example, you could choose barwidth(0.9) to flag that the variable in question is discrete.)
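
    A minimal sketch of that suggestion, reusing the command from #1 and adding only the discrete option. It assumes the smallest gap between distinct values of social is 1, in which case discrete should give one bin per integer, so barwidth(1) is dropped here as redundant.

    Code:

    * discrete: one bin per distinct value, with bars centered on the values
    * (assumes social holds integers only, as described in #1)
    histogram social, discrete percent xlabel(0 1 2 3 4 5 6 7 8 9 10 "10<") scheme(s1mono)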
    Last edited by Nick Cox; 24 Jun 2024, 09:44.



    • #3
      By way of more diagnosis, note that careful scrutiny of the result in #1 shows that the bars overlap, which is easiest to see in the right tail. That underlines that the bin width and the bar width are not identical with the command given. The barwidth() is explicitly 1 but the bin width was chosen by the command.
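
      One way to check this on your own data is sketched below. Without discrete, histogram prints a note of the form (bin=..., start=..., width=...) in the Results window; if the reported width is not 1, some bins may contain no integer at all, which would show up as a gap. The tabulate line is only a sanity check that every integer from 1 to 10 really is present.

      Code:

      * read the (bin=..., start=..., width=...) note that this command reports
      histogram social, percent

      * confirm that each integer value has at least one observation
      tabulate social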

      With data like yours, the discrete option is, I think, essential, but what you do with the bar width is up to you. I hope we can agree that there is no rationale for bar widths greater than 1. I would tend to use a bar width a bit less than 1 -- say 0.9 -- as a way of underlining that the variable is discrete. That's more about the psychology of graph decoding than about statistics or science.
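
      Put together, that might look like the sketch below. It assumes the same variable and options as in #1; the value 0.9 is a matter of taste, not a requirement.

      Code:

      * discrete sets the bins; barwidth(0.9) only narrows the drawn bars,
      * leaving small spaces between them to signal a discrete variable
      histogram social, discrete percent barwidth(0.9) xlabel(0 1 2 3 4 5 6 7 8 9 10 "10<") scheme(s1mono)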



      • #4
        Very good point! Thank you so much.
