Creating a bar chart for two categorical variables with error bars

    Hello everybody,

    I am trying to create a bar chart of frequencies for a categorical variable, ethnicity, with subgroups for another categorical variable, property ownership type. The graph should show the percentage of each ownership type within each ethnic group (e.g., in ethnic group 1, 60% own, 35% rent, ...), so that the percentages sum to 100% within each ethnic group. I want something that looks like this:
    [Attached image: Capture d’écran 2024-05-29 153816.jpg]

    I tried to do that with the following code, but I don't end up with what I want:

    Code:
    gen propertyethnicity = .
    replace propertyethnicity = property_ownership if p04a == 1
    replace propertyethnicity = property_ownership + 5  if p04a == 2
    replace propertyethnicity = property_ownership + 10  if p04a == 3
    replace propertyethnicity = property_ownership + 15  if p04a == 4
    replace propertyethnicity = property_ownership + 20  if p04a == 5
    replace propertyethnicity = property_ownership + 25  if p04a == 6
    replace propertyethnicity = property_ownership + 30  if p04a == 7
    list propertyethnicity property_ownership p04a, sepby(p04a)
    bysort p04a property_ownership: egen freq = count(p04a)
    bysort p04a: egen total_count = total(p04a)
    
    gen percent = (freq / total_count) * 100
    gen se = sqrt((percent/100) * (1 - percent/100) / total_count) * 100
    gen hi = percent + invttail(total_count-1,0.025) * se
    gen low = percent - invttail(total_count-1,0.025) * se
    
    twoway (bar percent propertyethnicity if property_ownership==1) ///
        (bar percent property_ownership==2) ///
        (bar percent propertyethnicity if property_ownership==3) ///
        (bar percent propertyethnicity if property_ownership==4) ///
        (rcap hi low propertyethnicity), ///
        legend(row(1) order(1 "Owning" 2 "Renting" 3 "Family" 4 "Other" 5 "Gurma" 6 "Mole-Dagbani" 7 "Other ethnicities")) ///
        xlabel(5 "Akan" 10 "Ga-Dangme" 15 "Ewe" 20 "Guan" 25 "Gurma" 30 "Mole-Dagbani" 35 "Other ethnicities", noticks) ///
        ytitle("Frequency of property ownership type [%]")

    I followed this "tutorial" but I can't seem to get it right: https://stats.oarc.ucla.edu/stata/fa...th-error-bars/

    I think the main issue comes from my attempt to add error bars, as the chart would be easy to produce without them.

    If someone could help me find my mistakes, I would be very grateful. I did my best to explain, but I'm new to this.
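
For comparison with the code above, here is a minimal sketch of how the same calculation might be restructured. It is not a verified solution for the real data. It runs on simulated toy data, and it assumes that p04a is coded 1 to 7 in the order of the ethnicity labels in the xlabel() option and that property_ownership is coded 1 to 4. It differs from the posted code in four places that look like possible sources of the problem: the group size is obtained by counting observations rather than with total(p04a), which sums the ethnicity codes; every bar plot uses propertyethnicity as its x variable; the legend names only the four bar plots; and the group labels sit at the centre of each cluster of bars (2.5, 7.5, ...) instead of in the gaps.

```stata
* Toy data standing in for the real survey (hypothetical values)
clear
set seed 2024
set obs 700
gen byte p04a = ceil(7 * runiform())                // ethnicity, assumed coded 1-7
gen byte property_ownership = ceil(4 * runiform())  // ownership, assumed coded 1-4

* One row per ethnicity x ownership cell; freq is the number of observations
* (contract replaces the data in memory)
contract p04a property_ownership, freq(freq)

* Group size = sum of the cell counts, not the sum of the p04a codes
bysort p04a: egen total_count = total(freq)

* Within-group percentage with a binomial standard error, as in the post
gen percent = 100 * freq / total_count
gen se  = 100 * sqrt((percent/100) * (1 - percent/100) / total_count)
gen hi  = percent + invttail(total_count - 1, 0.025) * se
gen low = percent - invttail(total_count - 1, 0.025) * se

* x positions: 1-4 for group 1, 6-9 for group 2, ..., with a gap between groups
gen propertyethnicity = property_ownership + 5 * (p04a - 1)

twoway (bar percent propertyethnicity if property_ownership == 1) ///
       (bar percent propertyethnicity if property_ownership == 2) ///
       (bar percent propertyethnicity if property_ownership == 3) ///
       (bar percent propertyethnicity if property_ownership == 4) ///
       (rcap hi low propertyethnicity), ///
       legend(rows(1) order(1 "Owning" 2 "Renting" 3 "Family" 4 "Other")) ///
       xlabel(2.5 "Akan" 7.5 "Ga-Dangme" 12.5 "Ewe" 17.5 "Guan" ///
              22.5 "Gurma" 27.5 "Mole-Dagbani" 32.5 "Other ethnicities", noticks) ///
       xtitle("") ytitle("Frequency of property ownership type [%]")
```

The /// line continuations only work when the commands are run from a do-file. With the real data, the first block (the toy data) would be dropped, and a preserve before contract would keep the original dataset available afterwards.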