
  • Ordering bar graphs when using over() and by()

    Hello,

    I am trying to graph the gender split across different areas, but I want the areas shown in a prespecified order. Currently, Stata orders the graphs by the values of the area variable (from 1 to 5). I want to order them so that areas with a similar gender ratio sit next to each other, with the share of males (or females) increasing from one area to the next.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(sex area)
    1 2
    1 4
    1 3
    1 1
    1 2
    1 4
    1 1
    1 5
    1 2
    0 4
    end
    label values sex sex
    label def sex 0 "Female", modify
    label def sex 1 "Male", modify
    ------------------ copy up to and including the previous line ------------------


    I am currently running the following code:

    Code:
    graph bar (percent), over(sex, sort(#)) by(area) asyvars

    I would also like to know whether I can write code to put all the graphs in a single row. I know that it can be done in the Graph Editor, but having code for it would be helpful as well. Could you also show me how to display the number of observations using blabel(), and how to show the total number of observations in each area through code? Thank you.
    Last edited by Aqib Yousuf; 17 Feb 2023, 02:58.

  • #2
    I would also like to know whether I can write code to put all the graphs in a single row. I know that it can be done in the Graph Editor, but having code for it would be helpful as well. Thank you!


    • #3
      The main problem here is addressed by the command myaxis from the Stata Journal, which is written up at https://journals.sagepub.com/doi/pdf...6867X211045582 and also earlier at https://www.statalist.org/forums/for...e-or-graph-use

      Beyond using myaxis to get the order you want, I recommend

      1. In your real data you are probably using area names not numbers, and there may be many more areas than 5. In that circumstance a vertical bar chart (often called a column chart) is often a poor choice. So use a horizontal display.

      2. Choose one variable to plot, say the percent of males or the percent of females.

      3. If and only if the percents in your full dataset cover a narrow range, say near 50%, a bar chart is likely to be a poor way to show the data, as comparisons with zero are not at all the issue, but comparisons of the areas with each other are the issue.

      Note that as in the syntax for G2 you can use a mean to get proportions but show percents in axis labels.

      Colour choices are naturally up to you, especially of colours that will make sense to (and not offend) your readership.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte(sex area)
      1 2
      1 4
      1 3
      1 1
      1 2
      1 4
      1 1
      1 5
      1 2
      0 4
      end
      label values sex sex
      label def sex 0 "Female", modify
      label def sex 1 "Male", modify
      
      myaxis AREA = area, sort(mean sex)
      
      set scheme s1color
      
      graph bar (percent), over(sex) by(AREA, row(1)) asyvars bar(1, color(orange_red)) bar(2, color(blue)) name(G1, replace)
      
      graph hbar sex, over(AREA) yla(0 .25 "25" .5 "50" .75 "75" 1 "100") ytitle(% of males) ysc(alt) l1title(Area) bar(1, color(blue)) name(G2, replace)
      
      graph dot sex, over(AREA) ytitle(% of males) ysc(alt) l1title(Area) exclude0 marker(1, mcolor(blue)) name(G3, replace)
      [Attached images: areasex1.png, areasex2.png]

      Last edited by Nick Cox; 17 Feb 2023, 03:25.
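
      For readers who want to see the idea behind sorting on mean(sex) without installing anything, here is a rough by-hand sketch using only official commands. It is not how myaxis is implemented and it is no substitute for it. The variable names pmale and order are made up for illustration, and breaking ties by the original area code is an assumption of this sketch. It also shows why a mean works: the mean of a 0/1 indicator is the proportion of 1s.

      ```stata
      * Same example data as above
      clear
      input byte(sex area)
      1 2
      1 4
      1 3
      1 1
      1 2
      1 4
      1 1
      1 5
      1 2
      0 4
      end

      * Mean of a 0/1 variable = proportion of males within each area
      egen double pmale = mean(sex), by(area)

      * New integer codes 1, 2, ... ordered by that proportion;
      * ties are broken by the original area code (an assumption of this sketch)
      egen order = group(pmale area)

      * Use the original area codes as value labels of the new variable
      levelsof order, local(levels)
      foreach l of local levels {
          summarize area if order == `l', meanonly
          label define order `l' "`r(mean)'", modify
      }
      label values order order

      * Check the mapping between the new order and the original codes
      tabulate order area
      ```

      The variable order can then be used in over() or by() in place of area. With the example data, area 4 is the only area that includes a female, so it comes first.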



      • #4
        Thank you! This sorts out my issue. What other visualizations would you recommend for comparing areas?



        • #5
          For what you’ve posted, bar charts are widely used, but in my view dot charts often work better, as already stated.



          • #6
            Is there a way to show the number of observations in the bar charts? Thanks.



            • #7
              You started a new thread on #6 over at https://www.statalist.org/forums/for...in-a-bar-chart where I have replied too.
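
              The reply in the linked thread is not reproduced here. Purely as a sketch with this thread's example data, one possible approach is to plot counts rather than percents and let blabel(bar) print each bar's height, and to put each area's total into the text that by() displays. The names n_area, area_n, and G4 are made up for illustration, and this may well differ from what was suggested in the other thread.

              ```stata
              * Same example data as in #1
              clear
              input byte(sex area)
              1 2
              1 4
              1 3
              1 1
              1 2
              1 4
              1 1
              1 5
              1 2
              0 4
              end
              label define sex 0 "Female" 1 "Male"
              label values sex sex

              * Total number of observations in each area, stored on every row
              bysort area: generate long n_area = _N

              * String variable carrying the area code and its total, for panel titles
              generate area_n = "Area " + string(area) + " (n = " + string(n_area) + ")"

              * Bars show counts by sex; blabel(bar) prints each bar's height, here a count
              graph bar (count), over(sex) by(area_n, row(1)) asyvars blabel(bar) name(G4, replace)
              ```

              Because area_n is a string variable, the panels appear in alphabetical order of its values; combine this with a reordered area variable if a different order is wanted.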
