  • hgraph - present top % by a group using contract option

    How do I show the top occupation (the one with the highest percentage) for each level of race in a horizontal bar graph?

    I tried the code below, and Stata reports the following error:

    Code:
    clear all
    sysuse nlsw88, clear
    preserve 
    contract occupation race, percent(pc)
    graph hbar (asis) pc if pc>30 , over(occupation, sort(1) descending) blabel(bar, format(%3.2f) over(race))
    restore
    ERROR:
    variables occupation and _variables do not uniquely identify the observations
    r(459);


    dataex

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(race occupation) int _freq double pc
    1 1 249 11.086375779162957
    2 1  60  2.671415850400712
    3 1   8  .3561887800534283
    1 2 231 10.284951024042742
    2 2  31 1.3802315227070348
    3 2   2 .08904719501335707
    1 3 548  24.39893143365984
    2 3 170  7.569011576135352
    3 3   8  .3561887800534283
    1 4  90  4.007123775601069
    2 4  12  .5342831700801425
    1 5  34 1.5138023152270703
    2 5  19  .8459483526268923
    1 6 125  5.565449688334818
    2 6 118  5.253784505788068
    3 6   3 .13357079252003562
    1 7  12  .5342831700801425
    2 7  16  .7123775601068566
    1 8 177  7.880676758682101
    2 8 104  4.630454140694568
    end
    label values race racelbl
    label def racelbl 1 "White", modify
    label def racelbl 2 "Black", modify
    label def racelbl 3 "Other", modify
    label values occupation occlbl
    label def occlbl 1 "Professional/Technical", modify
    label def occlbl 2 "Managers/Admin", modify
    label def occlbl 3 "Sales", modify
    label def occlbl 4 "Clerical/Unskilled", modify
    label def occlbl 5 "Craftsmen", modify
    label def occlbl 6 "Operatives", modify
    label def occlbl 7 "Transport", modify
    label def occlbl 8 "Laborers", modify

  • #2
    Is "top occupation" referring to the occupation held by the largest number of people? You probably want to calculate percentages within groups. Below, ties are resolved in favor of the occupation with the highest coded value. More advanced code would display sample sizes of the respective groups.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(race occupation) int _freq
    1 7  12
    1 5  34
    1 4  90
    1 6 125
    1 8 177
    1 2 231
    1 1 249
    1 3 548
    2 4  12
    2 7  16
    2 5  19
    2 2  31
    2 1  60
    2 8 104
    2 6 118
    2 3 170
    3 2   2
    3 6   3
    3 1   8
    3 3   8
    end
    label values race racelbl
    label def racelbl 1 "White", modify
    label def racelbl 2 "Black", modify
    label def racelbl 3 "Other", modify
    label values occupation occlbl
    label def occlbl 1 "Professional/Technical", modify
    label def occlbl 2 "Managers/Admin", modify
    label def occlbl 3 "Sales", modify
    label def occlbl 4 "Clerical/Unskilled", modify
    label def occlbl 5 "Craftsmen", modify
    label def occlbl 6 "Operatives", modify
    label def occlbl 7 "Transport", modify
    label def occlbl 8 "Laborers", modify
    
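    * Within each race, sort by frequency (ties broken by occupation code) and store the race total
    bys race (_freq occ): egen pct= total(_freq)
    * Convert each count to a within-race percentage
    by race: replace pct= (_freq/ pct)*100
    * Tag the last row per race, i.e. the most frequent occupation
    by race: gen tag= _n==_N
    graph hbar pct if tag, over(occupation) over(race, sort(1) label(angle(vertical))) ytitle(Percent) blab(total, format("%3.0f")) ylab("")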
    [Image: Graph.png]
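
    As a rough sketch of the "more advanced" variant mentioned above, the sample size of each group can be appended to its race label before graphing. This assumes the nlsw88 data shipped with Stata rather than the dataex extract, and the names n_race, racestr and racen are illustrative, not from the original code. Because of nomiss, observations with a missing race or occupation are dropped, so the sizes shown exclude them.

    ```stata
    * Sketch only: n_race, racestr and racen are hypothetical variable names
    sysuse nlsw88, clear
    * One row per race-occupation cell; nomiss drops rows with missing values
    contract race occupation, nomiss
    * Race total, with rows sorted so the largest cell is last within race
    bysort race (_freq occupation): egen n_race = total(_freq)
    by race: gen pct = 100 * _freq / n_race
    by race: gen tag = _n == _N
    * Build a string label such as "<race label> (n=<size>)"
    decode race, gen(racestr)
    gen racen = racestr + " (n=" + string(n_race) + ")"
    graph hbar pct if tag, nofill over(occupation) over(racen) ///
        ytitle(Percent) blabel(bar, format(%3.0f))
    ```

    The /// continuation needs a do-file; in the Command window, type the graph command on one line. Because racen is a string variable, the race groups are ordered alphabetically by label.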

    Last edited by Andrew Musau; 22 Feb 2024, 02:52.



    • #3
      Thanks for this. Could I take it a step further: what if I wanted to plot the top 3 occupation categories for each race?

      The graph should show only the following.

      The top 3 occupations for White would be:
      1. Sales
      2. Professional/Technical
      3. Managers/Admin
      For Black:
      1. Operatives
      2. Labourers
      3. Clerical

      I think the line that generates tag needs to be modified so that it works with the if condition in the graph command below. Could you please help?
      Code:
      by race: gen tag= _n==_N
      graph hbar pct if tag >=1 & tag <=3, over(occupation) over(race, sort(1) label(angle(vertical))) ytitle(Percent) blab(total, format("%3.0f")) ylab("")



      • #4
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte(race occupation) int _freq
        1 7  12
        1 5  34
        1 4  90
        1 6 125
        1 8 177
        1 2 231
        1 1 249
        1 3 548
        2 4  12
        2 7  16
        2 5  19
        2 2  31
        2 1  60
        2 8 104
        2 6 118
        2 3 170
        3 2   2
        3 6   3
        3 1   8
        3 3   8
        end
        label values race racelbl
        label def racelbl 1 "White", modify
        label def racelbl 2 "Black", modify
        label def racelbl 3 "Other", modify
        label values occupation occlbl
        label def occlbl 1 "Professional/Technical", modify
        label def occlbl 2 "Managers/Admin", modify
        label def occlbl 3 "Sales", modify
        label def occlbl 4 "Clerical/Unskilled", modify
        label def occlbl 5 "Craftsmen", modify
        label def occlbl 6 "Operatives", modify
        label def occlbl 7 "Transport", modify
        label def occlbl 8 "Laborers", modify
        
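        * Within each race, sort by frequency (ties broken by occupation code) and store the race total
        bys race (_freq occ): egen pct= total(_freq)
        * Convert each count to a within-race percentage
        by race: replace pct= (_freq/ pct)*100
        * Tag the last three rows per race, i.e. the three most frequent occupations
        by race: gen tag= _n>=_N-2
        graph hbar pct if tag, nofill over(occupation, sort(1)) over(race, sort(1) label(angle(vertical))) ytitle(Percent) blab(total, format("%3.0f")) ylab("")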
        [Image: Graph.png]
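
        The tag logic above generalizes to any number of top categories. A minimal sketch, assuming the nlsw88 data shipped with Stata rather than the dataex extract: the local macro k and the variables n_race, negfreq and rank are illustrative names, not part of the original code. Sorting on a negated frequency puts the largest cell first, so rank 1 is the most frequent occupation within each race.

        ```stata
        * Sketch only: k, n_race, negfreq and rank are hypothetical names
        sysuse nlsw88, clear
        contract race occupation, nomiss
        local k 3
        bysort race: egen n_race = total(_freq)
        gen pct = 100 * _freq / n_race
        * Negate the count so an ascending sort runs from largest to smallest
        gen negfreq = -_freq
        bysort race (negfreq occupation): gen rank = _n
        * Check the selection before graphing
        list race occupation _freq pct rank if rank <= `k', sepby(race) noobs
        graph hbar pct if rank <= `k', nofill over(occupation, sort(1) descending) ///
            over(race) ytitle(Percent) blabel(bar, format(%3.0f))
        ```

        In this version, ties are resolved in favor of the occupation with the lowest coded value, the opposite of the tag approach above. Run it from a do-file so that the local macro and the /// continuation are handled together.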

        Last edited by Andrew Musau; 22 Feb 2024, 05:28.
