  • How to create a clustered stacked bar graph across categories rather than within

    I need to create a clustered stacked bar graph showing the percentage of unemployed people who got a job, broken down by age group, gender and sector (all in one graph). I haven't looked at how to cluster the bars yet because I am still having problems with the stacked part. I have created the graph in Excel (please see below), but I have been asked to do it in Stata.

    What I want, but in Stata:
    [Image: Graph_Excel.png]


    I have 164,721 observations in total. Total men = 83,498 and total women = 81,223.

    I have divided the data into 3 age groups:
    age_group1 = < 30
    age_group2 = 30 - 44
    age_group3 = > 45

    I have 10 sectors that fall under the variable "Activity".

    For each sector, I want the following percentages stacked on top of each other:
    sector 1: (men_in_age_group1)/ Total_men x 100 , (men_in_age_group2)/ Total_men x 100 , (men_in_age_group3)/Total_men x 100

    where "men_in_age_group1" is the number of men in that age group in that sector and "Total_men" is the total number of men in the data set = 83,498.

    Then the same for women for sector 1, clustered with the bar above.


    The problem is that Stata's stacked bar graphs add up to 100% within each category (in this case each sector). I created three variables:

    gen men_age1=1 if sex==1 & age==1
    replace men_age1=0 if sex==1 & age > 1

    gen men_age2=1 if sex==1 & age==2
    replace men_age2=0 if sex==1 & age!= 2

    gen men_age3=1 if sex==1 & age==3
    replace men_age3=0 if sex==1 & age!= 3

    I tried the following graph command:

    graph bar men_age1 men_age2 men_age3, over(Activity) stack

    but the percentages it is graphing are:

    sector1: (men_in_age_group1)/ Total_men_in_sector1 , (men_in_age_group2)/Total_men_in_sector1, (men_in_age_group3)/Total_men_in_sector1

    or:
    [Image: Graph_Stata.png]


    The problem is the denominator. I have tried searching for help online, but haven't been able to find anything so far.

    I am using Stata 13. Also, this is my first post, so please let me know if there's anything else I should add or if anything is unclear.

    Thank you!!


  • #2
    Without seeing an example of your data (see FAQ #12 about using dataex), it is difficult to address this. However, note that graph bar plots the mean by default unless you specify another statistic in the command: graph bar (sum)..., graph bar (count)..., etc.
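
    As a rough sketch of what switching the statistic could look like with the variables from the first post: if each indicator is first divided by the total number of men, then summing within a sector gives that sector's share of all men rather than a within-sector mean. The pct_* variable names and the local macro total_men are made up for illustration, and this is untested against the full data.

    ```stata
    * total number of men in the data (the denominator wanted in post #1)
    count if sex == 1
    local total_men = r(N)

    * rescale each 0/1 indicator so that its sum is a percentage of all men
    * (pct_* names are hypothetical, introduced only for this sketch)
    foreach v of varlist men_age1 men_age2 men_age3 {
        generate pct_`v' = 100 * `v' / `total_men'
    }

    * (sum) replaces the default (mean), so bars no longer add up to 100 per sector
    graph bar (sum) pct_men_age1 pct_men_age2 pct_men_age3, over(Activity) stack
    ```

    With this scaling, the stacked bars for all sectors together should add up to 100 for men, which is the denominator asked for in the question.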
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      My apologies. Below is the code. Please note that although the graph using the example below isn't the exact same graph I attached earlier (that one uses the full sample), the same problem persists.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte sex float age byte Activity float(men_age1 men_age2 men_age3)
      2 2  9 . . .
      1 2  2 0 1 0
      2 3  5 . . .
      2 3  4 . . .
      1 2  1 0 1 0
      2 3  8 . . .
      2 2  4 . . .
      1 2  8 0 1 0
      2 3  9 . . .
      2 2  9 . . .
      1 2  8 0 1 0
      1 1  8 1 0 0
      1 1  2 1 0 0
      2 1  4 . . .
      2 1  2 . . .
      1 2  8 0 1 0
      2 2  8 . . .
      1 1  1 1 0 0
      1 3  8 0 0 1
      1 2  1 0 1 0
      2 3  7 . . .
      2 1  1 . . .
      1 3 10 0 0 1
      1 3  3 0 0 1
      2 3 10 . . .
      2 2  5 . . .
      2 2  8 . . .
      2 3  5 . . .
      1 3  4 0 0 1
      1 2  4 0 1 0
      1 3  2 0 0 1
      1 2  4 0 1 0
      1 2  3 0 1 0
      1 3  2 0 0 1
      2 2  9 . . .
      1 1  2 1 0 0
      2 1 10 . . .
      1 1  3 1 0 0
      1 1  4 1 0 0
      1 3  8 0 0 1
      2 1  6 . . .
      2 1  9 . . .
      2 1  4 . . .
      1 1  2 1 0 0
      2 1  9 . . .
      2 3  5 . . .
      2 2  9 . . .
      2 2  5 . . .
      2 2  8 . . .
      1 3  4 0 0 1
      1 1 10 1 0 0
      1 1  2 1 0 0
      1 3  5 0 0 1
      1 2  4 0 1 0
      1 3  2 0 0 1
      2 3  5 . . .
      2 1  8 . . .
      2 3  9 . . .
      2 1  8 . . .
      1 1  2 1 0 0
      end
      label values sex sexo_dem
      label def sexo_dem 1 "Men", modify
      label def sexo_dem 2 "Women", modify
      label values age gedad
      label def gedad 1 "<30", modify
      label def gedad 2 "30-44", modify
      label def gedad 3 ">45", modify



      • #4
        Thanks for providing data, but some of these variables just repeat information. I don't think you need the men_age* variables.

        There aren't any women in your sample, if I understand it correctly.



        • #5
          Here's some code that will get you started in the right direction. It does not produce a pretty graph, so it will take some effort to make it look nice. The numbers in the graph are only there to confirm that the bars match the percentages you want; you should remove that option (blabel()) going forward.


          Code:
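          * number of observations in each sex x age group x sector cell
          bysort sex age Activity: egen count_sex_age_sec=count(sex)
          * total number of observations for each sex (the denominator)
          bysort sex: egen count_sex=count(sex)
          * cell count as a percentage of all observations of that sex
          gen percent=count_sex_age_sec / count_sex *100
          format percent %5.1g
          * flag one observation per cell so each percentage is plotted once
          egen tag=tag(sex age Activity)
          gr bar (asis) percent if tag==1, over(age) over(sex) over(Activity) asyvar stack blabel(bar,format( %3.2g))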
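
          For the tidying-up step, one possible starting point is sketched below. It assumes percent and tag already exist from the code above, drops the bar labels, and uses only standard graph bar options; the axis title wording and the label settings are arbitrary choices for illustration, not requirements.

          ```stata
          * assumes percent and tag were created by the code above
          * the /// line continuations work in a do-file, not in the Command window
          graph bar (asis) percent if tag == 1,            ///
              over(age)                                    ///
              over(sex, label(labsize(small)))             ///
              over(Activity, label(angle(45)))             ///
              asyvars stack                                ///
              ytitle("Percent of all people of that sex")  ///
              legend(rows(1))
          ```

          Angling the sector labels and shrinking the sex labels is just one way to keep ten sectors readable on one axis; adjust to taste.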
          Stata/MP 14.1 (64-bit x86-64)
          Revision 19 May 2016
          Win 8.1



          • #6
            That's perfect, exactly what I needed! Thanks so much for your help!
