Generating upper and lower bounds of confidence intervals

Eli Omo

Join Date: Aug 2021

Posts: 40
#1

21 Feb 2022, 08:03

Hey everyone,

I am trying to generate the confidence interval of the mean variable that is plotted on my two way scatter graph.

This is the code for the two way scatter graph

Code:

graph twoway (scatter fidelityperweek bpweeks), by (ETHNICITY)

This is what I have currently done to try and generate the lower and upper bounds:

Code:

gen yu = fidelityperweek + 1.96*se_fidelityperweek

Code:

gen yl = fidelityperweek - 1.96*se_fidelityperweek

I get this error message:
se_fidelityperweek not found

I am using Stata 16.1

Thanks for any help in advance
Tags: categorical, data, graph
Nick Cox

Join Date: Mar 2014

Posts: 35211
#2

21 Feb 2022, 09:10

The immediate problem is that a reference to se_fidelityperweek implies to Stata that a variable with that name (or a numeric scalar with that name) exists in your dataset, but, as you were notified, Stata can't find such a variable or scalar. Put otherwise, you have to create it first.

As in your previous thread https://www.statalist.org/forums/for...vals-to-graphs your question is too vague to answer precisely. You need to back up and explain more about your set-up.

The underlying question can't be answered well without knowing what fidelity is. It is stated to be a "mean variable" but that doesn't give us enough information.

The use of 1.96 times some standard error is a crude approximation based on (1) knowing the standard error (as above) (2) the sampling distribution being normal or Gaussian (3) the sample being large enough for 1.96 being an adequate multiplier (even if (1) and (2) are correct, multipliers larger than 1.96 will be needed for small samples).
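As a rough illustration of point (3), the multiplier can be computed from the t distribution instead of being fixed at 1.96. This is only a sketch: the degrees of freedom used here (9 and 29, i.e. samples of 10 and 30) are made-up examples, not values taken from the data in this thread.

Code:

* normal-based multiplier for a 95% interval (about 1.96)
display invnormal(0.975)
* t-based multipliers; the degrees of freedom are illustrative only
display invttail(9, 0.025)
display invttail(29, 0.025)

Both t-based multipliers come out larger than 1.96 (roughly 2.26 and 2.05), and they shrink towards 1.96 as the sample size grows.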

Your example is murkier still, as you have a series of scatter plots, but the confidence intervals you want might be (examples follow)

means calculated separately (for what: a proportion? a mean count? a mean measurement?)

OR

something fitted or predicted by an analysis of variance in which bpweek is a discrete predictor

OR

the same, but using a regression in which bpweek is a continuous predictor

OR

the same, but using some nonlinear or nonparametric regression.

As said these are examples.

You've been posting to Statalist intermittently for some months, but it's not working out well for you, mostly I guess because you are ignoring the detailed advice at https://www.statalist.org/forums/help
Eli Omo

Join Date: Aug 2021

Posts: 40
#3

21 Feb 2022, 10:56

I understand, apologies for not being very clear.

fidelity: mean of blood pressure readings measured in that week

The graph currently shows fidelity across the different weeks (bpweeks). I want to work out the confidence interval of the mean that is plotted. I hope this is clearer now.
Ken Chui

Join Date: Aug 2014

Posts: 1054
#4

21 Feb 2022, 15:34

Originally posted by Eli Omo

I understand, apologies for not being very clear.

fidelity: mean of blood pressure readings measured in that week

The graph currently shows fidelity across the different weeks (bpweeks). I want to work out the confidence interval of the mean that is plotted. I hope this is clearer now.

No, it's not clear. Please use -dataex- to show some sample data so that we know what you're working with.

The most important point is we don't know if your data are already the means for each week. Let's say you have 10 weeks, and your data may look like this:

Code:

week_number  fidelity
          1         2
          2         3
          3         5
          4         7
          5         9
          6        12
          7        17
          8        28
          9        38
         10        44

In that case, you'll have to go back to the original data where each row is each unit of observation, and collapse the data again to get the SD and group size so that you can compute SE.
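A minimal sketch of that collapse step, under assumptions that may not match the real data: it supposes the original dataset has one row per blood pressure reading, held in a hypothetical variable called reading, alongside the ETHNICITY and bpweeks variables from post #1. The names mean_fid, sd_fid, n and se_fid are likewise made up for illustration.

Code:

* ASSUMPTION: one row per reading; "reading" is a hypothetical variable name
collapse (mean) mean_fid = reading (sd) sd_fid = reading (count) n = reading, by(ETHNICITY bpweeks)

* standard error of each weekly mean
gen se_fid = sd_fid / sqrt(n)

* t-based 95% limits, using each group's own degrees of freedom
gen yu = mean_fid + invttail(n - 1, 0.025) * se_fid
gen yl = mean_fid - invttail(n - 1, 0.025) * se_fid

* means with capped interval bars, one panel per group
twoway (rcap yl yu bpweeks) (scatter mean_fid bpweeks), by(ETHNICITY)

Groups with a single reading have no SD, so their limits will be missing. Note that collapse replaces the data in memory, so save the original dataset first.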

As said above, please read the FAQ (http://www.statalist.org/forums/help) on how to make the questions easier to answer. In case it's not clear, here are a couple recent questions that show data, and you can see the other users were able to suggest code very quickly:
https://www.statalist.org/forums/for...-between-years

https://www.statalist.org/forums/for...-in-panel-data

Your problem should not be too hard to solve in Stata, but you'd need to put the work into the question to make sure we understand the data and objective first. Good luck.
Nick Cox

Join Date: Mar 2014

Posts: 35211
#5

22 Feb 2022, 06:46

I agree with Ken Chui that the confidence intervals need to be calculated from the original data. I would use statsby and ci in most such cases, but I am reluctant otherwise to throw more time at a guessing game.
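For what it is worth, a statsby and ci combination might look something like the sketch below. It is a guess at the set-up, not a tested solution: it assumes the original data hold one reading per row in a hypothetical variable called reading, and it uses the ci means syntax of Stata 14 and later.

Code:

* ASSUMPTION: one row per reading; "reading" is a hypothetical variable name
* statsby replaces the data in memory with one row per group
statsby mean = r(mean) lb = r(lb) ub = r(ub), by(ETHNICITY bpweeks) clear: ci means reading

* plot the means with their 95% confidence limits
twoway (rcap lb ub bpweeks) (scatter mean bpweeks), by(ETHNICITY)

ci means uses the t distribution, so the limits allow for small group sizes without a hand-coded multiplier.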
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#6

22 Feb 2022, 06:58

My only comment here is that, depending on EXACTLY how the data were generated, the world of confidence intervals can be pretty varied. Yes, the 1.96 approach is one option, but it isn't always the appropriate one.

Sometimes, you don't have a very big set of units to pick from, if you're doing matching or synthetic controls. Sometimes you have to (or could defensibly) jackknife or use some manner of bootstrapped CI, sometimes you could do placebo tests based off of draws from your untreated units, sometimes...

Point is, without a solid dataset to work with, or without being able to reproduce the results as you calculated them, providing precise statistical advice on a topic that sometimes doesn't have very simple answers is, well, not very simple.
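Purely as an illustration of the bootstrap option mentioned above, and not as a recommendation for this particular dataset, a percentile bootstrap interval for a single mean could be sketched as follows. The variable name reading, the number of replications and the seed are all arbitrary assumptions.

Code:

* ASSUMPTION: "reading" is a hypothetical variable with one reading per row
bootstrap mean = r(mean), reps(1000) seed(12345): summarize reading

* report percentile-based confidence limits
estat bootstrap, percentile

Whether resampling individual rows is defensible depends on how the data were generated (for example, repeated readings on the same person would call for resampling people, not rows).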