SE or SD of the mean for descriptive statistics

Meng Yu

Join Date: Feb 2018

Posts: 169
#1

06 Jul 2021, 11:43

I am preparing a descriptive statistics table. With the summarize command I can obtain the standard deviation, but I have seen some papers report the SE of the mean instead. Which is more appropriate, and how can I obtain the SE of the mean? Thank you.
Ken Chui

Join Date: Aug 2014

Posts: 1057
#2

06 Jul 2021, 13:45

It can be obtained using -tabstat-:

Code:

sysuse auto, clear
tabstat mpg weight length, stat(semean)

As for which one is better, my answer is "the well labeled one is better." As long as it's clearly labeled, either one can show data dispersion.

In practice, I often work with rather large data sets, which makes the SE very small, so I generally prefer the SD if it is available (in some settings, such as data with complex survey weights, it may be difficult to get the SD). And if I have to show the SE, I usually go one step further and show confidence intervals, which are perhaps easier to interpret.
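
A minimal sketch of both options, using the built-in auto data purely for illustration (the variables are stand-ins, not anyone's real data). It puts the SD and the SE of the mean side by side and then requests confidence intervals:

```stata
* Illustrative only: built-in auto data
sysuse auto, clear

* Mean, SD, SE of the mean, and N, one row per variable
tabstat mpg weight length, statistics(mean sd semean n) columns(statistics)

* 95% confidence intervals for the means (Stata 14.1 or later;
* older releases use -ci mpg weight length- instead)
ci means mpg weight length
```

Whichever statistic ends up in the table, the column header should say explicitly whether it is an SD, an SE, or a confidence interval.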

It can also be field-dependent. Identify a few peer-reviewed articles or journals from your field and use them as a guideline.
Meng Yu

Join Date: Feb 2018

Posts: 169
#3

06 Jul 2021, 21:04

Thank you very much for your detailed answer. I really appreciate it. I have large data, and someone in my field who used the same data reported the standard deviation. I guess I will use the standard deviation too, although I am using panel data with complex survey weights. I wonder how to apply the weight and the survey wave in the same command. I only need the mean of one variable, so I guess the syntax will be something like:

Code:

sum variable if wave==1, pweight=weightvariable

Is that correct?

The rest of my variables are categorical. I wonder if I could do something like:

Code:

tabstat X1 X2 X3, by (wave) rows (variables) pweight=weightvariable

Thank you.
Meng Yu

Join Date: Feb 2018

Posts: 169
#4

07 Jul 2021, 12:23

For the first command,

Code:

sum variable if wave==1 [iweight=weightvar]

worked. For the second command, Stata returned "option rows not allowed".

Last edited by Meng Yu; 07 Jul 2021, 12:31.
Ken Chui

Join Date: Aug 2014

Posts: 1057
#5

07 Jul 2021, 12:31

What does the data source's documentation say? If the data come from a complex survey design (declared with -svyset-), then you can use -svy: mean- and -svy: tabulate-.
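
As a rough sketch of what that could look like, here is the NHANES II example data that ships with Stata (downloaded by -webuse-, so it needs an internet connection). The design variables psuid, stratid, and finalwgt belong to that example data set; your own PSU, strata, and weight variables are whatever the data documentation specifies, and -sex==1- is only a stand-in for a restriction such as one survey wave.

```stata
* Illustrative only: NHANES II example data, not the poster's panel
webuse nhanes2, clear

* Declare the design: PSU, sampling weight, and strata
svyset psuid [pweight=finalwgt], strata(stratid)

* Design-based mean of a continuous variable
svy: mean height

* Restrict to a subpopulation without dropping observations
svy, subpop(if sex==1): mean height

* Weighted proportions for a categorical variable
svy: tabulate region
```

Using subpop() rather than an -if- qualifier keeps the full design in the variance calculation, which is the usual recommendation for survey data.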
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#6

07 Jul 2021, 12:48

Neither command in #3 will work, because the weight goes before the comma; see the help files:

Code:

help summarize
help tabstat

Also, my 2 cents: SE is an inferential statistic, not a descriptive one; SD is a descriptive statistic (but custom may differ in your field).
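
To illustrate the placement, a small sketch with the built-in auto data, where the variable weight (vehicle weight) is only a stand-in for a real weighting variable and foreign stands in for wave. Note that -summarize- accepts aweights, fweights, and iweights but not pweights, and -tabstat- accepts aweights and fweights:

```stata
* Illustrative only: auto data; weight is a stand-in for a real weight
sysuse auto, clear

* Weight goes in square brackets before the comma
summarize mpg if foreign==0 [aweight=weight]

* Same pattern for -tabstat-; options follow the comma
tabstat mpg length [aweight=weight], by(foreign) statistics(mean sd) columns(variables)
```

The -tabstat- option that controls the table layout is columns(), with variables or statistics as its argument; there is no rows() option.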
Meng Yu

Join Date: Feb 2018

Posts: 169
#7

07 Jul 2021, 13:13

Thank you both. I agree SE is inferential. I use xtset to set my data. Will try svyset to see if it works with pweight.
Meng Yu

Join Date: Feb 2018

Posts: 169
#8

08 Jul 2021, 21:38

I found this article, and the quote below is from it. It seems I can use summarize with aweight.
https://www.stata.com/support/faqs/s...ry-statistics/

"First, let me show that summarize with aweights gives the same result as estat sd"
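
A sketch of the comparison the quote describes, again with the NHANES II example data (an assumption for illustration; it comes with its survey design already declared, and finalwgt is that data set's sampling weight):

```stata
* Illustrative only: NHANES II example data (already svyset)
webuse nhanes2, clear

* Design-based mean, then the SD of the variable itself
svy: mean height
estat sd

* Descriptive mean and SD using the sampling weight as an aweight
summarize height [aweight=finalwgt]
```

The point is to compare the Std. Dev. column of -estat sd- with the one from -summarize-. The standard error reported by -svy: mean- is a different quantity and depends on the full design, not just the weights.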
Joro Kolev

Join Date: Aug 2018

Posts: 3047
#9

09 Jul 2021, 01:07

If you have a variable X, the standard deviation of X is the standard deviation of X. The standard error is the standard deviation of mean(X). Therefore:
1. If you are interested in X, rather than mean(X) you should report standard deviation, and not standard errors.
2. Standard error = (Standard Deviation)/sqrt(Sample Size), and therefore the standard error converges to 0 as the sample size grows without bound. I do not find objects that converge to 0 very interesting as a description of variables or of the population. (They are of interest for inference from the given sample, as was mentioned above.)
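
The relationship in point 2 can be checked by hand. This is a small sketch on the built-in auto data (chosen only for illustration), computing the SE from the results stored by -summarize- and comparing it with what -tabstat- reports:

```stata
* Illustrative only: built-in auto data
sysuse auto, clear

* -summarize- leaves the SD and N in r(sd) and r(N)
quietly summarize mpg
display "SD of mpg          = " r(sd)
display "SE of the mean mpg = " r(sd)/sqrt(r(N))

* -tabstat- reports the same SE as semean
tabstat mpg, statistics(sd semean n)
```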

You might find the following FAQ useful, it discusses weights and svy jointly: https://www.stata.com/support/faqs/s...ry-statistics/
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#10

09 Jul 2021, 02:20

Meng:
as an aside to the previous excellent replies, the standard deviation does not make any assumption about the theoretical probability distribution that the sample under investigation comes from: it simply describes the dispersion of your data around a measure of central tendency (the mean).
The standard error can be read as the standard deviation of the sampling distribution of the mean; as such, it implies a reference theoretical probability distribution. Because it has to do with parameters, it is an inferential tool.

Kind regards,
Carlo
(Stata 19.0)
Meng Yu

Join Date: Feb 2018

Posts: 169
#11

09 Jul 2021, 08:15

Thank you both. I think the link Joro sent is the same as the one I sent. It also shows that sum with aweight gives the same result as svy: mean followed by estat sd.