Summary statistics by group

Jordan Driffill

Join Date: Mar 2015

Posts: 27
#1

15 Jun 2015, 17:46

I am trying to get summary statistics for my data by group. I want the number of observations, the mean, and the standard deviation for the following groups: tall, not tall, obese, not obese. I have been able to do this through the menus (Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Summary statistics) and then using by: tall, not tall, obese, not obese. I know you can do this with code; I just find it easier this way. The problem is that I get 4 separate tables and I would like my results in one table. Is this a limitation of Stata, or is it just a matter of somehow combining the tables when I copy them into Microsoft Word?

Thanks in advance for any help
Jordan
ben earnhart

Join Date: May 2014

Posts: 1027
#2

15 Jun 2015, 18:26

Say your obesity variable is called "status". Then a quick and easy way to get what you want, for a single variable, is:

Code:

tab status, sum(othervar)
Rich Goldstein

Join Date: Mar 2014

Posts: 4438
#3

15 Jun 2015, 18:55

Your data setup is completely unclear. Let me guess, however, that height and weight are two (or more) different variables. If so, use egen with the group() function to combine them into one grouping variable, and then use tab (as in #2), tabstat, table, etc. to get the statistics you want in one table.
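
A minimal sketch of that suggestion, using the auto dataset shipped with Stata as a stand-in. The indicator heavy and the grouping variable grp are hypothetical names chosen for illustration; the original poster's variables are not known.

```stata
* Stand-in data: the original poster's variables are unknown
sysuse auto, clear

* Hypothetical 0/1 indicator, playing the role of "obese"
generate byte heavy = weight > 3000

* Combine two indicators into one grouping variable with value labels
egen grp = group(foreign heavy), label

* One table: N, mean and SD of price for every combination
tabstat price, by(grp) statistics(n mean sd)
```

This gives one row per combination of the two indicators, which may or may not match the four overlapping groups described in #1.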
Oskar Solbraekke

Join Date: Mar 2017

Posts: 10
#4

03 Dec 2017, 13:12

Hi,

I have a very similar question. I wish to view the detailed descriptive statistics (in particular, averages of each percentile) of each group in my dataset.

The code provided in this thread (tab status, sum( othervar )) provides me with the mean and SD of each group, but I am unable to get the detailed descriptive statistics for each group. Any suggestions on how to do this?

Best regards,
Oskar
Rich Goldstein

Join Date: Mar 2014

Posts: 4438
#5

03 Dec 2017, 13:15

You don't say how many groups you have, what type of variable holds them, or how many variables you want descriptive statistics for. Given that, I suggest

Code:

help levelsof

and look at the examples, especially the examples using -foreach-
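
As a rough illustration of the -levelsof- plus -foreach- pattern, here is a sketch on the auto dataset, with rep78 standing in for the grouping variable and price for the variable of interest (both are assumptions, not the poster's data). For a string grouping variable, the comparison would need compound double quotes around the level.

```stata
sysuse auto, clear

* Collect the distinct non-missing values of the grouping variable
levelsof rep78, local(levels)

* Detailed summary statistics, one block per group
foreach l of local levels {
    display as text _newline "rep78 = `l'"
    summarize price if rep78 == `l', detail
}
```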
Oskar Solbraekke

Join Date: Mar 2017

Posts: 10
#6

03 Dec 2017, 13:20

There are 29 groups (Countries). The variable of interest is return on invested capital. I want detailed descriptive statistics only for the return on invested capital.

I will take a look at levelsof and foreach.

Clyde Schechter

Join Date: Apr 2014

Posts: 29953
#7

03 Dec 2017, 14:08

One way to do it is:

Code:

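capture program drop my_summarize
program define my_summarize
    // runby calls this once per country, with only that country's observations in memory
    local statistics N mean sd min max p1 p5 p10 p25 p50 p75 p90 p95 p99
    summ return_on_invested_capital, detail
    // store each returned result in a new variable, in the first observation only
    foreach s of local statistics {
        gen `s' = r(`s') in 1
    }
    // keep one row per country holding the statistics
    keep in 1
    keep country `statistics'
    exit
end

runby my_summarize, by(country) verbose

browse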

You will need to install Robert Picard and my -runby- from SSC to use this.
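
For reference, a community-contributed command on SSC is normally installed as below; this requires an internet connection and only needs to be done once per Stata installation.

```stata
* Install -runby- from SSC, then check its help file
ssc install runby
help runby
```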


Oskar Solbraekke

Join Date: Mar 2017

Posts: 10
#8

03 Dec 2017, 14:46

Wow. Thank you so much Clyde, that was exactly what I was looking for. The world needs more people like you.
Nick Cox

Join Date: Mar 2014

Posts: 35429
#9

03 Dec 2017, 17:52

See also tabstat
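
A sketch of what that could look like, again on the auto dataset with foreign as a stand-in for country and price for return on invested capital (both assumptions). All the statistic names used here are ones tabstat accepts.

```stata
sysuse auto, clear

* One table, one row per group, one column per statistic
tabstat price, by(foreign) ///
    statistics(n mean sd min p1 p5 p10 p25 p50 p75 p90 p95 p99 max) ///
    columns(statistics) nototal
```

Unlike the -runby- approach, this displays the results rather than leaving them in a dataset.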
Junran Cao

Join Date: May 2019

Posts: 75
#10

31 Mar 2020, 06:45

Dear Professor Schechter

I, too, find your program below very useful.

Can I please ask if it is possible to generalise it for more than 1 variable? That is, in the slightly modified code below, if we can include variable_2, etc. in the program itself?

Thanks.

Originally posted by Clyde Schechter

One way to do it is:

Code:

capture program drop my_summarize
program define my_summarize
    local statistics N mean sd min max p1 p5 p10 p25 p50 p75 p90 p95 p99
    summ variable_1, detail
    foreach s of local statistics {
        gen `s' = r(`s') in 1
    }
    keep in 1
    keep group `statistics'
    exit
end

runby my_summarize, by(country) verbose

browse

You will need to install Robert Picard and my -runby- from SSC to use this.
Nick Cox

Join Date: Mar 2014

Posts: 35429
#11

31 Mar 2020, 06:56

#10 collapse already offers that functionality.
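
A hedged sketch of the collapse route for more than one variable, with price and mpg from the auto dataset standing in for variable_1 and variable_2 and foreign standing in for the grouping variable. The new variable names are illustrative only.

```stata
sysuse auto, clear

* Each statistic needs its own target name when the same source
* variable is summarized more than once
collapse (count) n_price=price   n_mpg=mpg      ///
         (mean)  mean_price=price mean_mpg=mpg  ///
         (sd)    sd_price=price   sd_mpg=mpg    ///
         (p50)   p50_price=price  p50_mpg=mpg,  ///
         by(foreign)

list, noobs
```

Note that collapse replaces the data in memory, so wrap it in preserve and restore if the original data are still needed.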
Junran Cao

Join Date: May 2019

Posts: 75
#12

31 Mar 2020, 16:51

Oh I see. Thank you Nick!
Marta ArespaC

Join Date: Jun 2021

Posts: 11
#13

29 Nov 2021, 03:12

Dear all,
I needed to compute the variance of the first 12 observations by id in my panel, and managed it with the help of these posts. I get the variances in a table (Nick's solution) or in a new dataset (Clyde's solution).
Now I would like to use these per-id variances as the first non-missing value of a new variable in the original panel dataset (in long form):
- the first 11 observations in each panel should be missing;
- the 12th should be the corresponding variance for that id;
- from the 13th observation onward in each panel, the new variable should follow a formula. To be more precise:

newvar= (1-lambda)*var1[_n-1]+lambda*newvar[_n-1]

How could I construct this?

Note: what I actually need is to create two variables with a value for volatility in ewma form and in GARCH form, for every period (not the series filtered with tssmooth exponential or a GARCH model for a single series). I wanted to reproduce the equations explained in this link, which are:

ewma_variance=(1-lambda)*squaredreturns[_n-1]+lambda*ewma_variance[_n-1]
garch_variance=omega+alpha*squaredreturns[_n-1]+lambda*garch_variance[_n-1]

If Stata has a built-in way to do this, I would be grateful to know that too.

Thanks in advance for your help.

Last edited by Marta ArespaC; 29 Nov 2021, 03:14.
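
One possible way to build the EWMA recursion described above, offered as a sketch rather than a tested answer. It assumes a returns variable ret, a time index time within each id, a seed equal to the variance of ret over the first 12 observations, and an illustrative lambda of 0.94; all of these names and values are hypothetical. It relies on the fact that replace works through observations in order, so each value can use the one just computed before it.

```stata
* Simulated panel: 3 ids with 20 periods each (illustration only)
clear
set seed 12345
set obs 60
generate id = ceil(_n/20)
bysort id: generate time = _n
generate double ret = rnormal(0, 0.01)
generate double r2 = ret^2
local lambda = 0.94

* Seed: variance of the first 12 returns in each panel, placed at observation 12
bysort id: egen double sd12 = sd(cond(time <= 12, ret, .))
generate double ewma_variance = sd12^2 if time == 12

* Recursion from observation 13 onward, within each panel
bysort id (time): replace ewma_variance = ///
    (1 - `lambda')*r2[_n-1] + `lambda'*ewma_variance[_n-1] if time > 12
drop sd12
```

The GARCH-type equation could be filled in the same way once values for omega, alpha and the coefficient on the lagged variance are chosen; estimating those parameters is a separate step.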
Louis Steyn

Join Date: Nov 2023

Posts: 1
#14

19 Nov 2023, 07:10

Dear all,
I have a panel dataset with T=2 for which I simply want to display the means of the control variables by treatment group (1 or 0) and the p-value for the difference in means. Is there a simple command that allows me to do this?

Thank you and best wishes,

Louis
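
One simple possibility is to loop -ttest- over the control variables and display its stored results. The sketch below uses the auto dataset, with foreign standing in for the treatment indicator and price, mpg and weight for the controls (all assumptions). With T=2, each unit appears twice, so it may be safer to run this on a single period using an if qualifier, since the plain two-sample t test treats rows as independent.

```stata
sysuse auto, clear

* Header: variable, mean in group 0, mean in group 1, p-value
display %-10s "variable" %12s "mean (0)" %12s "mean (1)" %10s "p-value"

foreach v of varlist price mpg weight {
    quietly ttest `v', by(foreign)
    display %-10s "`v'" %12.2f r(mu_1) %12.2f r(mu_2) %10.4f r(p)
}
```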