Quick question about -mean- command confidence intervals

Thomas Robert

Join Date: Apr 2023

Posts: 9
#1


26 Feb 2024, 14:34

Hello everyone,

I have a quick (hopefully easy) question. When I use the -mean- command in Stata, it produces confidence intervals that do not match my results when I calculate these manually. Below is an example. I will use the auto.dta file in Stata so my results can be replicated.

Code:

sysuse auto.dta

I will use the variable "price" and, using the -mean- command, calculate a 99% confidence interval. The results are: N = 74, Mean = 6165.257, Std. Err. = 342.8719, Lower = 5258.405, Upper = 7072.108.

Code:

mean price, level(99)

To calculate this manually, I save the results of the mean and the std. err. as macros that I call "mean" and "std_err":

Code:

matrix list e(b)
mat b = e(b)
global mean = b[1,1]
dis $mean

Stata calculates the standard error of the mean as the square root of the variance (p. 6, https://www.stata.com/manuals13/rmean.pdf). Thus, I use the variance saved by the -mean- command, which is stored in the matrix e(V).

Code:

matrix list e(V)
mat V = e(V)
global variance = V[1,1]
global std_err = sqrt($variance)
dis $std_err
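As a cross-check on the standard error, here is a minimal sketch that compares the square root of e(V) with the textbook formula sd/sqrt(n). It assumes unweighted data and the default variance estimator of -mean-; the scalar names se_from_V and se_by_hand are made up for illustration.

```stata
sysuse auto, clear
mean price, level(99)

* Standard error as the square root of the stored variance
matrix V = e(V)
scalar se_from_V = sqrt(V[1,1])

* Same quantity by hand: sample standard deviation over sqrt(n)
quietly summarize price
scalar se_by_hand = r(sd) / sqrt(r(N))

display %12.7f se_from_V "  " %12.7f se_by_hand
```

Both numbers should agree with the Std. Err. of 342.8719 that -mean- reports, so the standard error is not the source of any discrepancy in the interval.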

Now that I have the mean and std. err. saved as precise values from the command, I use the confidence interval calculation: mean +/- (t-ratio * standard error). I looked up the t-ratio for 74 degrees of freedom, and found 2.644. I save the results in global macros called "lower_ci" and "upper_ci" for the lower and upper bounds.

Code:

global lower_ci = $mean - (2.644 * $std_err)
dis $lower_ci
global upper_ci = $mean + (2.644 * $std_err)
dis $upper_ci

As my manual results show, I get a lower CI of 5258.7034, but this is different from the lower CI reported using the -mean- command, which is 5258.405. Likewise, my manual result for the upper CI is 7071.8101, but the one reported using -mean- is 7072.108. The results are close, but not exact. Does anyone know why this is? Additionally, does anyone know how I can use my manual method to get exact results to match the -mean- command in Stata?

Thanks!
Tags: confidence intervals, mean command
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

26 Feb 2024, 14:54

Rounding error. The t-ratio you are using is 2.644, which is a rounded value of the value that Stata is using: 2.6448688. (In fact, it is actually incorrectly rounded: it should be 2.645.) If you use the unrounded value, you will get the same results that -mean- shows you. (You can find the t-ratio Stata uses in r(table)["crit", "price"].)
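A minimal sketch of the manual calculation with the unrounded critical value follows. It uses scalars rather than global macros, which is only a matter of taste, and the scalar names (tcrit, tcheck, mean_price, se_price, lower_ci, upper_ci) are made up for illustration. The name-based subscript on r(table) assumes a Stata version recent enough to support it.

```stata
sysuse auto, clear
mean price, level(99)

* Critical value exactly as -mean- stores it
scalar tcrit = r(table)["crit", "price"]

* The same value computed directly: upper-tail 0.005 point of t on n - 1 df
scalar tcheck = invttail(e(N) - 1, 0.005)

* Point estimate and standard error from the stored results
matrix b = e(b)
matrix V = e(V)
scalar mean_price = b[1,1]
scalar se_price   = sqrt(V[1,1])

* Manual 99% confidence interval
scalar lower_ci = mean_price - tcrit * se_price
scalar upper_ci = mean_price + tcrit * se_price
display %10.3f lower_ci "  " %10.3f upper_ci
```

With the unrounded critical value, the displayed bounds should match the 5258.405 and 7072.108 shown by -mean-.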
Thomas Robert

Join Date: Apr 2023

Posts: 9
#3

26 Feb 2024, 15:58

Hi Clyde,

Thanks so much for the quick response! I had a feeling it was due to rounding.

Another quick follow-up: when I look at r(table), it shows the degrees of freedom as 73, not 74 (the total number of observations). When finding a critical value, are we supposed to use n-1 for the degrees of freedom?

Thanks!
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#4

26 Feb 2024, 17:20

Yes, you should use n-1 degrees of freedom. Whenever a variance is calculated on n observations and the mean is itself calculated from those n observations, you lose 1 df for calculating that mean, hence n-1. If the variance were calculated using a "known" mean exogenously obtained, then you would use n df.

In the present situation, it makes little difference because the critical two-sided value for a 1% critical region is nearly the same for 73 or 74 df.
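To see how close the two critical values are, here is a short sketch using -invttail()-, which returns the upper-tail quantile of the t distribution. The tail probability 0.005 corresponds to a two-sided 99% interval.

```stata
* Upper-tail 0.005 critical values for a two-sided 99% interval
display %9.7f invttail(73, 0.005)   // n - 1 df, the value -mean- uses
display %9.7f invttail(74, 0.005)   // n df
display %9.7f invnormal(0.995)      // large-sample (normal) limit, for reference
```

The first line should reproduce the 2.6448688 quoted in #2. The second comes out at roughly 2.644 to three decimals, so a table lookup at 74 df lands very near the value used in #1.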
Thomas Robert

Join Date: Apr 2023

Posts: 9
#5

28 Feb 2024, 16:50

Hi Clyde,

That is really helpful, thank you! I really appreciate your quick responses.

Thanks,
Thomas