  • Changing value of labels in x-axis, graph bar.

    Hi everyone!

    I am working with a panel dataset containing the value and type of properties in a municipality, including the type of land use (residential, commercial, industrial, etc.). I'm trying to build a bar graph showing the composition of property types across percentiles of property value (e.g., the lowest percentiles of the distribution are mostly residential and commercial properties). I'm following these steps:
    • Collapse data to get the number of properties by percentiles and type of properties. What I get is a small dataset in a long format with the number of properties in the first percentile that are residential, commercial, and so on.
    • Reshape the previous long dataset to a wide format so that in rows I have the percentiles, and in columns, I have the number of properties for each category. After the reshape adjustment, I get the following dataset:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    
    percentile unit0 unit1 unit2 unit3 unit4 unit5 unit6 unit7 unit8
     0 168  948  24  48 11  11  1  . .
     1 164  793  43 161 51   .  .  . .
     2 193  801  33 155 25   1  1  . .
     3 143  866  35 154 10   3  .  . .
     4 142  845  46 160 10   8  1  . .
     5 143  867  35 150 11   3  1  . .
     6 161  851  49 129 17   7  1  . .
     7 133  927  25 103 13   3  1  1 .
     8 152  917  43  88  7   2  1  . .
     9 133  930  38  92 12   6  .  . .
    10 139  933  36  92  7   4  .  . .
    11 123  932  53  86  9   5  2  . .
    12 143  929  47  70 16   5  .  1 .
    13 147  921  43  80 11   9  .  . .
    14 132  943  37  82  7   9  .  . .
    15 146  957  33  64  4   7  .  . .
    16 122  960  49  66  6   8  .  . .
    17 123  971  36  68  6   7  .  . .
    18 142  964  29  64  2  10  .  . .
    19 129  984  38  49  4   5  1  . .
    20 123  987  25  53 10  11  .  1 .
    21 138  981  47  35  3   5  1  1 .
    22 137  962  44  53  7   8  .  . .
    23 138  973  30  46 12  10  .  1 .
    24 134  974  44  52  2   5  .  . .
    25 129  993  43  34  4   8  .  . .
    26 133 1009  26  33  3   6  .  . .
    27 112 1021  32  34  7   5  .  . .
    28 124  996  45  38  .   8  .  . .
    29 110 1017  34  30  7  10  1  1 .
    30 109 1037  30  29  6   .  .  . .
    31 136 1045  36  38  4   4  .  1 .
    32 138  932  41  38  3   6  1  . .
    33 127  989  37  46  6   4  .  . .
    34 119 1021  40  24  3   3  .  1 .
    35 125  989  42  35  9   7  2  1 .
    36 128  989  39  29 13  12  1  . .
    37 140  985  44  26 11   6  .  . .
    38 136  983  40  36  7   6  .  1 .
    39 106 1011  45  28  5  15  .  1 .
    40 115 1004  48  28  8   5  1  2 .
    41 137  972  50  36  4   9  2  . .
    42 157  961  42  29 11  10  1  . .
    43 120  996  52  30  2  11  .  . .
    44 123 1012  41  21  6   6  .  1 .
    45 144  974  45  37  7   4  .  . .
    46 116 1031  29  21  5   7  .  2 .
    47 117  971  67  30  6  18  1  . .
    48 118  999  49  33  3   8  1  . .
    49 144  995  41  21  2   7  .  . .
    50 115  996  48  37  7   6  2  . .
    51 131  968  63  29  8  12  .  . .
    52 132  965  67  29  7   9  .  1 .
    53 150  953  64  20  6  17  1  . .
    54 126  973  65  34  9   4  .  . .
    55 123  952  77  26  9  20  .  3 .
    56 129  958  79  26  7  11  2  . .
    57  90  979  82  27 13  18  1  . .
    58 109  972  62  45  5  16  .  1 .
    59 128  957  73  26  6  18  2  1 .
    60 135  956  78  47  5  16  4  2 .
    61 108  939  82  30  7  12  .  . .
    62 109  936  92  34 19  14  4  3 .
    63 122  942  78  44 11  12  2  . .
    64 103  967  86  29  7  16  1  1 .
    65  98  951  95  44  8  15  .  . .
    66 112  936 103  36  5  19  .  . .
    67 106  928 107  38  9  21  1  . .
    68 103  970  74  36  8  19  .  1 .
    69 121  926  95  39  5  22  3  . .
    70 130  901 103  36  8  30  1  1 .
    71 112  903 120  34  7  35  .  . .
    72 120  888 108  51  6  35  .  3 .
    73  87  924 113  41  7  33  3  2 .
    74  90  926 106  43  7  35  2  2 .
    75  92  920 123  34  4  38  3  1 .
    76 105  897 130  37  5  24  5  3 .
    77  97  902 117  51  4  39  .  1 .
    78  85  897 149  41  3  35  1  . .
    79  94  886 129  46  6  45  3  1 .
    80  88  874 147  50  5  46  .  1 .
    81  95  874 144  51  7  38  1  1 .
    82  93  847 153  62  8  47  .  . .
    83 102  850 144  58  3  51  1  2 .
    84  98  865 131  64  8  43  1  1 .
    85  78  842 171  59  4  52  3  1 .
    86  94  834 151  67  8  54  .  3 .
    87  98  773 190  67  4  76  .  3 .
    88  90  801 179  72  8  53  6  1 .
    89  98  770 184  73  6  75  2  3 .
    90  98  720 209  89  9  80  5  1 .
    91  95  724 209  89  7  79  4  3 .
    92  89  681 239  96  9  89  5  3 .
    93  61  740 234  80  4  89  .  3 .
    94  65  672 240  91  8 125  5  4 .
    95  85  642 236 113 10 117  6  2 .
    96  94  596 245 126 10 135  3  2 .
    97  76  529 249 178 10 154  9  5 .
    98  76  441 253 204  8 210 11  8 .
    99  77  230 146 343 23 343 22 25 1
    end
    • Finally, I use the following code to build my figure.
    Code:
    global graphopt "legend(region(lcolor(none))) graphr(color(white)) ylabel(, nogrid)"
    
    graph bar unit0 unit1 unit2 unit3 unit4 unit5 unit6 unit7 unit8,         ///
                  over(percentile , gap(0))                                         ///
                  percent stack $graphopt ylabel(0(25)100)                             ///
                  legend(row(3) lab(1 "Missing") lab(2 "Residential")                 ///
                  lab(3 "Residential with Commercial/Industrial") lab(4 "Commercial") ///
                  lab(5 "Empty Lot") lab(6 "Industrial") lab(7 "Civil Entities")    ///
                  lab(8 "Religious Entities") lab(9 "Wholesale Establishment") size(vsmall)) ///
                  bar(1, color(dknavy%80) lwidth(none)) bar(2, color(navy%60) lwidth(none)) ///
                  bar(3, color(blue%60) lwidth(none)) bar(4, color(blue%20) lwidth(none)) ///
                  bar(5, color(midblue%60) lwidth(none)) bar(6, color(ebblue%60) lwidth(none)) ///
                  bar(7, color(eltblue%60) lwidth(none)) bar(8, color(gray%60) lwidth(none)) ///
                  bar(9, color(darkgray%60) lwidth(none)) ytitle("Share of owners (%)") yscale(titlegap(3))
    And get the following graph:

    [Image: figure.png]


    As you can see, the x-axis labels are a mess. What I'd like is an x-axis that shows a label only at every 25th percentile, plus the word "Percentile" as the axis title. Just like this:

    [Image: fig2.png]


    How do I get this result?
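
For readers following along, the two data-preparation steps listed in this post (collapse to counts, then reshape to wide) can be sketched on toy data. This is only a minimal illustration, not the poster's actual code: the variable names `type` and `one` and the six toy rows are assumptions made up for the example.

```stata
* Toy data: one row per property (hypothetical variables and values)
clear
input percentile type
0 0
0 1
0 1
1 1
1 2
1 2
end
gen byte one = 1

* Step 1: number of properties by percentile and type (long format)
collapse (count) unit = one, by(percentile type)

* Step 2: one row per percentile, one column per type (wide format)
reshape wide unit, i(percentile) j(type)
sort percentile
list
```

Combinations that never occur in the data (here, type 2 in percentile 0) come out as missing after the reshape, which is why the posted dataset contains so many `.` entries.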






  • #2
    graph bar is written with the assumption that every bar deserves an explanation. It's quite hard work to suppress the labels but there are various possibilities.

    How to work around this and how to avoid getting stuck in that swamp at all were covered in https://journals.sagepub.com/doi/pdf...6867X211000032

    Stacked bars collectively tell you that the sum is 100% which is reassuring but not informative. Whatever is plotted at the top and bottom is easy to follow; whatever is plotted in the middle not so easy to follow. Unwelcome advice, but my advice nevertheless, is to portray series separately.

    I will post a bit later with some constructive suggestions.



    • #3
      Looking at this in more detail, and telling you what you already know: You have 9 categories here and only about 4 are easily visible. I played around with quite different ideas, and only one of those seemed any use. Here it is. myaxis is from the Stata Journal.

      Detail: Your data example is invaluable but needed a small edit to work.

      Code:
      clear 
      input percentile unit0 unit1 unit2 unit3 unit4 unit5 unit6 unit7 unit8
       0 168  948  24  48 11  11  1  . .
       1 164  793  43 161 51   .  .  . .
       2 193  801  33 155 25   1  1  . .
       3 143  866  35 154 10   3  .  . .
       4 142  845  46 160 10   8  1  . .
       5 143  867  35 150 11   3  1  . .
       6 161  851  49 129 17   7  1  . .
       7 133  927  25 103 13   3  1  1 .
       8 152  917  43  88  7   2  1  . .
       9 133  930  38  92 12   6  .  . .
      10 139  933  36  92  7   4  .  . .
      11 123  932  53  86  9   5  2  . .
      12 143  929  47  70 16   5  .  1 .
      13 147  921  43  80 11   9  .  . .
      14 132  943  37  82  7   9  .  . .
      15 146  957  33  64  4   7  .  . .
      16 122  960  49  66  6   8  .  . .
      17 123  971  36  68  6   7  .  . .
      18 142  964  29  64  2  10  .  . .
      19 129  984  38  49  4   5  1  . .
      20 123  987  25  53 10  11  .  1 .
      21 138  981  47  35  3   5  1  1 .
      22 137  962  44  53  7   8  .  . .
      23 138  973  30  46 12  10  .  1 .
      24 134  974  44  52  2   5  .  . .
      25 129  993  43  34  4   8  .  . .
      26 133 1009  26  33  3   6  .  . .
      27 112 1021  32  34  7   5  .  . .
      28 124  996  45  38  .   8  .  . .
      29 110 1017  34  30  7  10  1  1 .
      30 109 1037  30  29  6   .  .  . .
      31 136 1045  36  38  4   4  .  1 .
      32 138  932  41  38  3   6  1  . .
      33 127  989  37  46  6   4  .  . .
      34 119 1021  40  24  3   3  .  1 .
      35 125  989  42  35  9   7  2  1 .
      36 128  989  39  29 13  12  1  . .
      37 140  985  44  26 11   6  .  . .
      38 136  983  40  36  7   6  .  1 .
      39 106 1011  45  28  5  15  .  1 .
      40 115 1004  48  28  8   5  1  2 .
      41 137  972  50  36  4   9  2  . .
      42 157  961  42  29 11  10  1  . .
      43 120  996  52  30  2  11  .  . .
      44 123 1012  41  21  6   6  .  1 .
      45 144  974  45  37  7   4  .  . .
      46 116 1031  29  21  5   7  .  2 .
      47 117  971  67  30  6  18  1  . .
      48 118  999  49  33  3   8  1  . .
      49 144  995  41  21  2   7  .  . .
      50 115  996  48  37  7   6  2  . .
      51 131  968  63  29  8  12  .  . .
      52 132  965  67  29  7   9  .  1 .
      53 150  953  64  20  6  17  1  . .
      54 126  973  65  34  9   4  .  . .
      55 123  952  77  26  9  20  .  3 .
      56 129  958  79  26  7  11  2  . .
      57  90  979  82  27 13  18  1  . .
      58 109  972  62  45  5  16  .  1 .
      59 128  957  73  26  6  18  2  1 .
      60 135  956  78  47  5  16  4  2 .
      61 108  939  82  30  7  12  .  . .
      62 109  936  92  34 19  14  4  3 .
      63 122  942  78  44 11  12  2  . .
      64 103  967  86  29  7  16  1  1 .
      65  98  951  95  44  8  15  .  . .
      66 112  936 103  36  5  19  .  . .
      67 106  928 107  38  9  21  1  . .
      68 103  970  74  36  8  19  .  1 .
      69 121  926  95  39  5  22  3  . .
      70 130  901 103  36  8  30  1  1 .
      71 112  903 120  34  7  35  .  . .
      72 120  888 108  51  6  35  .  3 .
      73  87  924 113  41  7  33  3  2 .
      74  90  926 106  43  7  35  2  2 .
      75  92  920 123  34  4  38  3  1 .
      76 105  897 130  37  5  24  5  3 .
      77  97  902 117  51  4  39  .  1 .
      78  85  897 149  41  3  35  1  . .
      79  94  886 129  46  6  45  3  1 .
      80  88  874 147  50  5  46  .  1 .
      81  95  874 144  51  7  38  1  1 .
      82  93  847 153  62  8  47  .  . .
      83 102  850 144  58  3  51  1  2 .
      84  98  865 131  64  8  43  1  1 .
      85  78  842 171  59  4  52  3  1 .
      86  94  834 151  67  8  54  .  3 .
      87  98  773 190  67  4  76  .  3 .
      88  90  801 179  72  8  53  6  1 .
      89  98  770 184  73  6  75  2  3 .
      90  98  720 209  89  9  80  5  1 .
      91  95  724 209  89  7  79  4  3 .
      92  89  681 239  96  9  89  5  3 .
      93  61  740 234  80  4  89  .  3 .val 
      94  65  672 240  91  8 125  5  4 .
      95  85  642 236 113 10 117  6  2 .
      96  94  596 245 126 10 135  3  2 .
      97  76  529 249 178 10 154  9  5 .
      98  76  441 253 204  8 210 11  8 .
      99  77  230 146 343 23 343 22 25 1
      end
      
      egen total = rowtotal(unit?)
      
      forval j = 0/8 { 
          gen pc`j' = cond(missing(unit`j'), 0, 100 * unit`j'/total)
      }
      
      reshape long pc unit, i(percentile) j(which) 
      
      label def text 0 "Missing" 1 "Residential" 2  `" "Residential/" "Commercial/Industrial" "' 3  "Commercial" 4  "Empty Lot" 5  "Industrial" 6  "Civil" 7  "Religious" 8  "Wholesale" 
      
      label val which text 
      
      myaxis newwhich=which, sort(mean pc) descending 
      
      line pc percentile, by(newwhich, yrescale note("")) xla(0(25)100) yla(#5) ytitle(percent)
      [Image: owners.png]
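
The percentage step in the code above can be checked in isolation. The following is a minimal sketch on two made-up rows with three categories (the values and the variable `check` are assumptions for illustration only); it uses the same `rowtotal()` and `cond()` logic and confirms that the shares in each row sum to 100.

```stata
* Two made-up rows; the second has a missing count
clear
input percentile unit0 unit1 unit2
0 168 948 24
1 164 793  .
end

* rowtotal() treats missing counts as zero when summing
egen total = rowtotal(unit?)

* Share of each category, with missing counts shown as 0 percent
forval j = 0/2 {
    gen pc`j' = cond(missing(unit`j'), 0, 100 * unit`j'/total)
}

* Sanity check: the shares in each row should add up to 100
egen check = rowtotal(pc?)
assert abs(check - 100) < 1e-3
list percentile pc0 pc1 pc2
```

Mapping missing counts to 0 rather than leaving them missing keeps each line in the by-panel plot connected across all percentiles.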



      • #4
        Hi Nick!
        I appreciate your suggestions. I had no idea about Stata tip 140. This is life-changing. I'll follow your idea of showing each series separately.



        • #5
          Anyone wishing to experiment with these data should note that some trailing garbage ("val") was added by accident in #3 on the line for percentile 93. That was probably the result of my typing some text when I thought I was typing somewhere else. It should be edited out before trying anything.



          • #6
            Just in case someone else is wondering how it turned out, here's the final code (I opted for my first idea of showing the bar graph instead):

            Code:
            preserve
                    replace percentile = percentile + 1
                    keep if anio == 2020
                    gcollapse (count) unit, by(percentile categoria_siempre)
                    reshape wide unit, i(percentile) j(categoria_siempre)
                    drop if missing(percentile)
                    
                        * Build relabel() pairs: show 1 and every 25th percentile, blank out the rest
                        loc x_labels
                        forvalues j = 1/100 {
                            if mod(`j', 25) == 0 | `j' == 1 {
                                loc x_labels `x_labels' `j' "`j'"
                            }
                            else {
                                loc x_labels `x_labels' `j' " "
                            }
                        }
                        
                    graph bar unit0 unit1 unit2 unit3 unit4 unit5 unit6 unit7 unit8,                     ///
                          over(percentile, gap(0) relabel(`x_labels'))                                     ///
                          percent stack $graphopt ylabel(0(25)100)                                         ///
                          legend(row(3) lab(1 "Missing") lab(2 "Residential")                             ///
                          lab(3 "Residential with Commercial/Industrial") lab(4 "Commercial")             ///
                          lab(5 "Empty Lot") lab(6 "Industrial") lab(7 "Civil Entities")                ///
                          lab(8 "Religious Entities") lab(9 "Wholesale Establishment") size(vsmall))     ///
                          bar(1, color(dknavy%80) lwidth(none)) bar(2, color(navy%60) lwidth(none))     ///
                          bar(3, color(blue%60) lwidth(none)) bar(4, color(blue%20) lwidth(none))       ///
                          bar(5, color(midblue%60) lwidth(none)) bar(6, color(ebblue%60) lwidth(none))  ///
                          bar(7, color(eltblue%60) lwidth(none)) bar(8, color(gray%60) lwidth(none))    ///
                          bar(9, color(darkgray%60) lwidth(none)) b2title("Percentile", size(medsmall)) ///
                          ytitle("Share of owners (%)") yscale(titlegap(3))
                    graph export "Figures/property_type_percentile.pdf", replace
                restore
            The main change was the relabel() suboption of over(): it shows a number only at percentile 1 and at every 25th percentile, and blanks out all the other labels. The b2title() option then turned out to be a nice way to add the "Percentile" axis title. The final product:

            [Image: Captura de pantalla 2024-02-09 134918.png]
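
The relabel idea from this post can be tried on a much smaller example. The sketch below is an illustration under assumed toy data (8 bars, two made-up series, a label at bar 1 and at every 4th bar), not the code used for the final figure.

```stata
* Toy data: 8 categories and two made-up series
clear
set obs 8
gen percentile = _n
gen unit0 = 10 + _n
gen unit1 = 30 - _n

* relabel() expects pairs of bar position and label text;
* keep a number at bar 1 and every 4th bar, a single space elsewhere
local x_labels
forvalues j = 1/8 {
    if mod(`j', 4) == 0 | `j' == 1 {
        local x_labels `x_labels' `j' "`j'"
    }
    else {
        local x_labels `x_labels' `j' " "
    }
}

graph bar unit0 unit1, over(percentile, gap(0) relabel(`x_labels')) ///
    percent stack b2title("Percentile")
```

Note that relabel() refers to bars by their position (1, 2, ...), not by the values of the over() variable, so the pairs line up with the percentile values here only because those values also run from 1 upward.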

