How to add confidence intervals to a plot with two different variables over gender?

Alex Izquierdo

Join Date: Jun 2019
Posts: 12

16 Jan 2024, 05:15

Dear users,

After a long time looking for a way to do this, I come to you as my last hope. I have a dataset in which students pick other students for their teams. There are two variables of interest: the gender of the individual picking and the gender of the individual picked. I want to plot the probability of being selected depending on those two variables (that is, the probability that a boy picks a boy or a girl, and the probability that a girl picks a boy or a girl; 4 bars in total). I have already counted how many boys and girls each student picks, and from that computed the probability of being picked. This is the graph I get:

Code:

graph bar maleprobmean femaleprobmean, over(female)

[Image: Gender_selection.png]

Now I want to add the confidence intervals to that graph. I have already computed them with collapse, in the following way:

Code:

collapse (mean) maleprobmean = maleprob_team femaleprobmean = femaleprob_team (sd) sdmaleprob = maleprob_team sdfemaleprob = femaleprob_team (count) nmale = maleprob_team nfemale = femaleprob_team, by(female)

generate himale = maleprobmean + invttail(nmale-1,0.025)*(sdmaleprob / sqrt(nmale))
generate lomale = maleprobmean - invttail(nmale-1,0.025)*(sdmaleprob / sqrt(nmale))

generate hifemale = femaleprobmean + invttail(nfemale-1,0.025)*(sdfemaleprob / sqrt(nfemale))
generate lofemale = femaleprobmean - invttail(nfemale-1,0.025)*(sdfemaleprob / sqrt(nfemale))

And this is what the data look like:

female	maleprobmean	femaleprobmean	sdmaleprob	sdfemaleprob	nmale	nfemale	himale	lomale	hifemale	lofemale
Male	.2157536	.0935449	.1723743	.129409	854	854	.2273309	.2041763	.1022365	.0848533
Female	.0963259	.2172072	.1362991	.1678487	873	881	.1053798	.087272	.228306	.2061084
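
As a quick sanity check on the formula, the limits for male pickers can be recomputed directly from the listed mean, standard deviation and count. This is only a sketch that types the displayed values in by hand, so the results should agree with himale and lomale only up to the rounding of the listing:

```stata
* Values copied by hand from the Male row of the listing above
* (mean = .2157536, sd = .1723743, n = 854)
display .2157536 + invttail(854 - 1, 0.025) * (.1723743 / sqrt(854))
display .2157536 - invttail(854 - 1, 0.025) * (.1723743 / sqrt(854))
```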

The only problem now is putting those two things together. I have tried the graph twoway command, but the result is very odd: since I have no additional variable to group them by, the two CIs and the two bars for each variable of each gender sit on top of each other. I cannot for the life of me figure out how to add them, simple as it may be.
If someone has an idea of how to do it, I would be extremely thankful.


Nick Cox

Join Date: Mar 2014

Posts: 35698
#2

16 Jan 2024, 06:41

The good news is that plotting confidence intervals is often discussed here and indeed elsewhere (e.g. in the Stata Journal).

Let's back up -- and in the absence of a data example -- use some invented data. I start with the idea that

"There are two variables of interest: the gender of the individual picking and the gender of the individual picked. I want to plot in a graphic the probability of being selected depending on those two variables"

I got lost soon after that on what you have done and how you are thinking about this, so I will stick to a simple formulation. If I am misunderstanding your set-up, I am still pointing to what should be relevant commands. You don't need to calculate the confidence intervals yourself.

I see those variables as defining pr(boy picks girl) and pr(girl picks girl). The other two probabilities, pr(boy picks boy) and pr(girl picks boy), are just their complements. So two means and two confidence intervals summarize the data. I use ideas similar to those in https://journals.sagepub.com/doi/pdf...867X1001000112

Code:

clear
set obs 100
set seed 2803
gen picker = runiformint(0, 1)
gen picked = runiformint(0, 1)
label def female 0 male 1 female
label val picker female
label val picked female
statsby, clear by(picker) : ci proportion picked, jeffreys
scatter mean picker, ms(Dh) msize(large) || rcap ub lb picker , xla(0 1, valuelabel noticks) ///
xsc(r(-0.2 1.2)) aspect(1) ytitle(proportion picking female) legend(off) subtitle(means and 95% confidence intervals: Jeffreys method)

Naturally there are many variations on this idea. The Jeffreys method just happens to be a personal favourite.
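
One such variation, offered only as a sketch, plots the summary values already listed in #1 rather than invented data. The numbers are typed in by hand from that listing. The variables xmale and xfemale are hypothetical helpers introduced here to shift the two series sideways so that the intervals do not sit on top of each other, and the axis and legend wording is my guess at what the variables mean:

```stata
* Summary values copied by hand from the listing in #1
clear
input byte female double(maleprobmean femaleprobmean lomale himale lofemale hifemale)
0 .2157536 .0935449 .2041763 .2273309 .0848533 .1022365
1 .0963259 .2172072 .087272  .1053798 .2061084 .228306
end
label define female 0 "Male" 1 "Female"
label values female female

* Hypothetical helper variables: offset the two series around each picker gender
generate xmale   = female - 0.1
generate xfemale = female + 0.1

* Intervals as capped spikes, means as point symbols on top
twoway (rcap lomale himale xmale)                 ///
       (rcap lofemale hifemale xfemale)           ///
       (scatter maleprobmean xmale, ms(Dh))       ///
       (scatter femaleprobmean xfemale, ms(Oh)),  ///
       xlabel(0 "Male picker" 1 "Female picker", noticks) ///
       xscale(range(-0.4 1.4)) xtitle("")         ///
       ytitle("mean probability of being picked") ///
       legend(order(3 "male picked" 4 "female picked"))
```

The offset of 0.1 is arbitrary. Any small shift that keeps the two intervals visibly apart within each picker gender would do.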

The use of bar charts seems unnecessary here. The point is comparisons of probabilities with each other and say with 0.5, not with zero. See also any thread on the internet against dynamite, detonator or plunger plots.
Alex Izquierdo

Join Date: Jun 2019

Posts: 12
#3

17 Jan 2024, 07:48

Hi Nick,

Thank you very much for your idea. I knew it had to be simple enough, but for some reason I was fixated on bar charts and couldn't get away from them. The point plot does indeed look clearer and makes adding the CIs much easier.