  • Coloring Histogram Bars Above/Below Thresholds


    Hi,
    I'm trying to create a histogram that flags height values outside a specified range in red and values within that range in green. My thought was to use twoway and overlay two separate histograms (one for the 'in-range' values and one for the 'out-of-range' values). However, because the Ns differ, each histogram computes its percentages from its own subsample, which throws off the bar heights and makes the plot look odd. Is there any way to keep the same N across twoway plots so that the scaling is consistent? I've attached some sample code below:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double bs041
       80
       82
       85
     87.5
       89
     89.5
     95.3
     95.5
    100.7
      101
      101
      105
      105
      113
    113.7
      117
    120.6
    120.8
    120.8
    122.4
      123
      125
    125.9
      126
    126.7
      127
      127
      130
      130
      131
      132
      132
      132
      132
      132
      132
      132
      132
      132
    132.1
    132.6
    132.7
      133
    133.7
      134
      134
      134
      134
      134
      134
      134
    134.6
      135
      135
      135
      135
      135
      135
    135.3
    135.4
    135.6
    135.6
    135.7
    135.7
      136
      136
      136
    136.9
    136.9
    137.5
    137.7
      138
      138
    138.9
    139.5
      140
      140
    140.2
    140.6
    140.7
      141
      142
      142
      142
      142
      142
      142
      142
      142
      142
      142
      142
    142.1
    142.3
    142.7
      143
      143
      143
      143
      143
    end
    This is what (part of) the histogram looks like without applying different colors or restricting the observations at all:
    Code:
    histogram bs041, percent color(gray%30) xlabel(50(50)250) ylabel(, angle(0)) xline(125 220, lcolor(black)) text(40 172.5 "Height Range (125-220 cm)")
    This was my attempt. However, it makes the red bars look huge in comparison, simply because the two histograms have different denominators:
    Code:
    twoway histogram bs041 if (bs041 >= 125 & bs041 <= 220), percent color(green%30) || histogram bs041 if (bs041 < 125 | bs041 > 220), percent color(red%30) xlabel(50(50)250) ylabel(, angle(0)) xline(125 220, lcolor(black)) text(40 172.5 "Height Range (125-220 cm)")
    I've also included a (rather crude...) picture of the larger sample:

    [Image: Screenshot 2024-01-12 at 3.12.03 PM.png]


    Thanks!
    David



  • #2
    Perhaps this:

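    * frause (from SSC) loads example datasets; oaxaca is one of them
    ssc install frause
    frause oaxaca, clear
    * Bin lnwage once over all observations: h = bar height, b = bin midpoint
    twoway__histogram_gen lnwage, gen(h b) w(0.1)
    * Draw each range of bins as its own bar series so that each gets its own color
    twoway bar h b if b<3, barw(0.1) || bar h b if b>=3 & b<4, barw(0.1) || bar h b if b>=4 & !missing(b), barw(0.1)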
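
    Applied to the height data in #1, the same idea might look like the sketch below. It assumes bs041 from #1 is in memory. The new variable names pct and mid, the bin width of 5, and the start value of 80 are illustrative choices, not anything given in the thread. With those choices the bin edges fall exactly on 125 and 220, so no bar straddles a threshold. The colors are blue and orange rather than red and green.

    ```stata
    * Sketch only: width(5) and start(80) are assumptions chosen so that
    * bin edges land on 125 and 220; start() must not exceed the minimum of bs041.
    twoway__histogram_gen bs041, percent width(5) start(80) generate(pct mid)

    * pct is a percent of ALL observations, so both bar series share one denominator
    twoway (bar pct mid if inrange(mid, 125, 220), barwidth(5) color(blue%40)) ///
           (bar pct mid if mid < 125 | (mid > 220 & !missing(mid)), barwidth(5) color(orange%40)), ///
           xline(125 220, lcolor(black)) ylabel(, angle(0)) ///
           ytitle("Percent") xtitle("Height (cm)") ///
           legend(order(1 "In range" 2 "Out of range"))
    ```

    Because the bins are computed once from the full variable and only then split by color, the bar heights match what a single histogram with the percent option would show for the same width and start. The /// line continuations need a do-file. For the larger sample, start() would have to be at or below the smallest height while still differing from 125 by a multiple of the width.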



    • #3
      Also see https://www.statalist.org/forums/for...m-or-graph-bar



      • #4
        Just a comment that combining red and green should be avoided. Difficulty in distinguishing red and green is a common visual challenge.



        • #5
          Following Nick's remark in #4, I recommend this recent open-access paper in Nature Communications on the use of colour in science communication. It is relevant to any visualization of quantitative data: Crameri, F., Shephard, G. E., & Heron, P. J. (2020). The misuse of colour in science communication. Nature Communications, 11(1), 5444. https://doi.org/10.1038/s41467-020-19160-7
          http://publicationslist.org/eric.melse
