Can tabstat or any other command report row means easily

Chen Samulsion

Join Date: Jan 2018
Posts: 921

07 Dec 2021, 18:15

Dear Stata users,

I usually use tabstat to get summary statistics for a varlist, and it performs very well. However, sometimes I also want the row mean across the variables, in addition to the mean of each separate variable. So, without resorting to egen's mean() function and generating a new variable, is there any command that can do this?

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(gnp1970 gnp1980 gnp1990) byte quarter
  3922   5310 6830.4 1
3922.3 5190.1 6853.2 2
3961.3 5179.2 6837.5 3
  3931 5253.7 6804.6 4
4027.3 5330.3 6747.1 1
  4042 5306.6 6766.9 2
4064.7 5341.8 6781.2 3
4072.9 5286.2   6821 4
4149.8 5206.1 6883.4 1
4232.5 5236.8 6937.5 2
4272.8 5212.8 6980.2 3
4343.9 5221.7 7067.7 4
4439.6   5255 7065.9 1
4475.9 5365.6 7096.8 2
4471.4 5448.3 7118.6 3
4495.1 5540.5   7213 4
4475.4 5641.4 7278.2 1
4492.4 5707.5 7369.2 2
4442.4 5749.5 7403.2 3
4430.2 5787.3 7494.8 4
4361.7 5818.1   7519 1
4403.2 5870.3 7531.4 2
4471.7 5954.9 7572.7 3
4531.8 5996.7 7645.4 4
4620.6 6038.3 7703.3 1
4655.6 6042.6 7819.6 2
  4677 6097.5 7853.8 3
4720.7   6126 7948.2 4
4791.9 6157.2 8024.3 1
4852.2   6221 8148.8 2
4924.3 6266.3 8233.2 3
4922.2   6372 8289.6 4
4958.8 6423.8 8432.1 1
5107.7 6485.4 8476.3 2
5155.2 6504.8   8560 3
5227.4   6581 8731.6 4
5236.7   6668 8843.8 1
5244.9 6691.3 8910.8 2
5280.1 6727.1 9031.1 3
5296.6 6748.5 9204.7 4
end

Code:

. tabstat gnp1970 gnp1980 gnp1990, by(quarter)

Summary statistics: mean
  by categories of: quarter 

 quarter |   gnp1970   gnp1980   gnp1990   row mean of 70/80/90
---------+-----------------------------------------------------
       1 |   4498.38   5784.82   7532.75                5938.65
       2 |   4542.87   5811.72   7591.05                5981.88
       3 |   4572.09   5848.22   7637.15                6019.15
       4 |   4597.18   5891.36   7722.06                6070.20
---------+-----------------------------------------------------
   Total |   4552.63   5834.03  7620.752
---------------------------------------------------------------

Alan Neustadtl

Join Date: Mar 2014
Posts: 107

07 Dec 2021, 19:25

How about something like this:

Code:

gen id=_n
reshape long gnp, i(id) j(year)

gen gnpmean=.
foreach num of numlist 1 2 3 4 {
  summ gnp if quarter==`num', meanonly
  replace gnpmean=r(mean) if quarter==`num'
}

reshape wide
tabstat gnp*, by(quarter)


Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#3

07 Dec 2021, 19:36

The code in #2 will work, and if the data set isn't very large, the execution time required for the -reshape-s will be acceptable. But there is no need to slow things down still more by using a loop over quarters to get the gnp mean variable.

Code:

gen id=_n
reshape long gnp, i(id) j(year)
by quarter, sort: egen gnpmean = mean(gnp)
reshape wide
tabstat gnp*, by(quarter)

That said, I'm imagining that Chen Samulsion was hoping for something a bit more quick, direct, and simple. In fact, my guess is that he would prefer using -egen, rowmean()- and avoiding two -reshape-s to this approach. I think he was hoping to avoid creating any new variable, having the table-writing command handle it internally. Offhand, I don't know any way to do that. But there are several user-written commands for making tables, and it is likely that one of them can do it. Perhaps somebody familiar with one will see this thread and respond.
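
For comparison, the -egen, rowmean()- route mentioned above might look like the following minimal sketch, run on the data example in #1. The variable name gnprowmean is only illustrative and does not come from the thread. It does create a new variable, which is exactly what the original question hoped to avoid. With no missing values, the by-quarter mean of the row means equals the average of the three by-quarter column means.

Code:

* Row mean across the three gnp variables, stored in a new variable
* (gnprowmean is a hypothetical name chosen for this sketch)
egen gnprowmean = rowmean(gnp1970 gnp1980 gnp1990)

* Report the original means plus the row mean, by quarter
tabstat gnp1970 gnp1980 gnp1990 gnprowmean, by(quarter)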
Chen Samulsion

Join Date: Jan 2018

Posts: 921
#4

07 Dec 2021, 21:15

Alan Neustadtl and Clyde Schechter, thank you very much.

That said, I'm imagining that Chen Samulsion was hoping for something a bit more quick, direct, and simple. In fact, my guess is that he would prefer using -egen, rowmean()- and avoiding two -reshape-s to this approach. I think he was hoping to avoid creating any new variable, having the table-writing command handle it internally.

Clyde made clear what I was thinking. Perhaps I should not be so lazy when doing non-routine work (always reaping without sowing in Stata?). All in all, I have learned much from both of you.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#5

08 Dec 2021, 05:30

No one has quite said this yet but the data example in #1 shows a layout that for most Stata purposes should be reshaped to long and kept that way. That layout has very few advantages and many disadvantages.
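
As a rough illustration of what the long layout buys here, the sketch below starts from the data example in #1, reshapes it once, and then asks for a two-way table of means. It assumes Stata 17 or later, where -table- accepts the statistic() option and shows margin totals by default; the total column then holds the mean of gnp pooled over years within each quarter. In this balanced example that pooled mean coincides with the row mean asked for in #1, and no extra mean variable is needed.

Code:

* One-time reshape from the wide example in #1 to long
gen id = _n
reshape long gnp, i(id) j(year)

* Means by quarter and year; the margins give the pooled means
* (assumes the Stata 17+ -table- syntax)
table (quarter) (year), statistic(mean gnp) nformat(%9.2f)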
Chen Samulsion

Join Date: Jan 2018

Posts: 921
#6

08 Dec 2021, 06:08

The data I showed in #1 were transformed from the shipped dataset gnp96, which is indeed organized in long form.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#7

08 Dec 2021, 06:22

I don't quite follow #6. If you're saying that you created a simple example to show the problem, then that makes sense and indeed is helpful.
Chen Samulsion

Join Date: Jan 2018

Posts: 921
#8

08 Dec 2021, 06:35

Yes, Nick. I created the data example from gnp96 to show my problem. The original gnp96.dta is organized in long form. As you said, long form is the best choice in this case, and perhaps that is why Stata ships it that way.