  • Displaying Categories with zero observations in bar chart

    Dear all,

    I have observational data in a variable called taskcompletionrate. This variable contains information on the percentage of tasks completed by health workers. It has 10 categories (1 = 0-10%, 2 = 10-20%, and so on). I want to create a bar graph displaying the percentage of observations within each category, but I would like the x-axis to also show the categories that contain no observations (in my case there are observations only in categories 8, 9, and 10).

    Code:
    gen taskcompletionrate=summated_score_stand
    recode taskcompletionrate 0/9.99=1 10/19.99=2 20/29.99=3 30/39.99=4 40/49.99=5 50/59.99=6 60/69.99=7 70/79.99=8 80/89.99=9 90/100=10
    tab taskcompletionrate
    graph bar (percent), over (taskcompletionrate) allcategories
    But this shows only a bar graph with categories 8, 9, and 10. Is there an option or modification in Stata that displays all categories?

    Thanks a lot
    Best
    Theresa

  • #2
    In essence you have a variable that is a percent(age) and you want to make explicit that values below 70% are possible but did not occur.

    That is clear as a goal, but the recode is quite unnecessary here and graph bar is not a good fit for the problem, if only because of what prompted the question: graph bar doesn't really have any notion of what might have occurred but did not. Also, if the variable is essentially continuous, starting with discrete bars is a poor fit.

    You could just go

    Code:
    histogram summated_score_stand, xla(0(10)100) start(0) width(10) percent
    with the advantage that you can change the bin width easily if needed. In fact, some variation on

    Code:
    quantile summated_score_stand, rlopts(lc(none)) yla(0(10)100, ang(h)) 
    shows that binning is not needed to show the values (and limits) of a distribution.
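
    As a rough illustration of the bin-width point, here is a minimal sketch on simulated data. The simulated values are purely hypothetical stand-ins for summated_score_stand (uniform between 70 and 100, which is an assumption and not the real data); only the histogram options come from the command above.

    ```stata
    * Simulated stand-in for the real variable (assumption: values lie in 70-100)
    clear
    set seed 12345
    set obs 200
    generate summated_score_stand = 70 + 30 * runiform()

    * Axis labels run over 0-100, so the empty bins below 70 remain visible
    histogram summated_score_stand, xlabel(0(10)100) start(0) width(10) percent

    * Changing the bin width is a one-option edit
    histogram summated_score_stand, xlabel(0(10)100) start(0) width(5) percent
    ```

    Because xlabel() sets the axis range, the empty part of the scale is shown without creating any artificial categories.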

    If you still need, or at least prefer, to bin the data, then

    Code:
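    * gooddata stands for the original variable, here summated_score_stand
    gen degraded = ceil(gooddata/10)

    * same bins, labelled by their upper limits 10, 20, ..., 100
    gen degraded2 = 10 * ceil(gooddata/10)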
    are cleaner versions of your mapping.
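
    One detail worth checking before swapping in ceil(): it treats bin boundaries differently from the recode in the question. The sketch below uses a handful of made-up values to show this. The variable bin_floor is a hypothetical alternative, not something proposed in this thread, that reproduces the recode intervals (0 to just under 10 is 1, ..., 90 to 100 is 10).

    ```stata
    * A few illustrative values, including bin boundaries
    clear
    input gooddata
    0
    5
    10
    15
    95
    100
    end

    * ceil() puts exact multiples of 10 in the lower bin and maps 0 to 0
    generate degraded  = ceil(gooddata/10)
    generate degraded2 = 10 * ceil(gooddata/10)

    * Hypothetical floor()-based variant that matches the recode intervals
    generate bin_floor = min(floor(gooddata/10) + 1, 10)

    list, separator(0)
    ```

    With ceil(), a score of exactly 10 lands in bin 1 and a score of 0 in bin 0, whereas the recode (and bin_floor) put them in categories 2 and 1 respectively. Which convention is right depends on how the categories are defined.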

    • #3
      Thanks a lot. The histogram option works.
      I do understand why recoding was unnecessary here, but in fact my initial variable (summated_scale) was created from two groups (one with 9 steps in the checklist and one with only 5 of the 9 steps). I then created a standardized summated scale (summated_scale_stand) that maps both checklists to a score from 0 to 100, so that it is more easily interpreted as a task completion rate (proportion of tasks completed). I then wanted to recode the variable as a categorical one to display the proportions as categories.

      • #4
        Thanks for the detail, but I can't follow why arbitrary binning makes anything easier to follow. The other way round: my guess is that your coding means that only certain values will occur in your summated scale, which is why I would want to see a quantile plot or something equivalent.
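
        To make the "only certain values" guess concrete, here is a small sketch that lists the scores each checklist could produce. It assumes the standardized score is simply 100 * steps completed / steps in the checklist; that formula is an assumption, since the thread does not say how the scale was built, and the variable names are hypothetical.

        ```stata
        * Enumerate possible scores under the assumed formula
        clear
        set obs 10
        generate completed = _n - 1

        * 9-step checklist: 0, 11.1, 22.2, ..., 100
        generate score9 = 100 * completed / 9

        * 5-step checklist: 0, 20, 40, ..., 100
        generate score5 = 100 * completed / 5 if completed <= 5

        list completed score9 score5, separator(0)
        ```

        If something like this holds, the data can take at most a small set of distinct values, and a quantile plot of the raw scores would show them directly rather than hiding them inside 10-point bins.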