Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Table of descriptive data. Possible to switch rows and columns using tabstat?

    Hello,
    In my dataset I have 3 groups (1, 2, 3) and 20 continous variables (A, B, C, etc. ) for each of the groups. I'd like to make a table that displays e.g. the mean value (and sd) for each of the variables, in each of the groups. The tabstat function gives me the statistics by group, but the groups are organized as rows, which makes the table very wide (and divided when it gets very wide). Is there a way to make tabstat work the opposite way, e.g. so that the groups are organized as rows, and the variables as rows? Or another way to make a table like the one below?
    Variables Gr.1 Gr.2 Gr. 3
    A mean (sd) of A mean, (sd) of A mean, (sd) of A
    B mean, (sd( of B etc.
    C
    etc
    Thanks,
    Ingrid

  • #2
    Here are two variations that are possible using the tabstat command.


    Code:
     tabstat invest, statistics( mean sd ) by(company)
    
    Summary for variables: invest
         by categories of: company
    
     company |      mean        sd
    ---------+--------------------
           1 |    608.02  309.5746
           2 |   410.475  125.3994
           3 |    102.29   48.5845
           4 |   86.1235  42.72555
           5 |   61.8025  15.16693
           6 |    55.411  34.94722
           7 |   47.5955  18.31569
           8 |   42.8915  19.11019
           9 |    41.889  14.88117
          10 |    3.0845  1.718661
    ---------+--------------------
       Total |  145.9583  216.8753
    ------------------------------
    
    . tabstat invest, statistics( mean sd ) by(company) columns(variables)
    
    Summary statistics: mean, sd
      by categories of: company
    
     company |    invest
    ---------+----------
           1 |    608.02
             |  309.5746
    ---------+----------
           2 |   410.475
             |  125.3994
    ---------+----------
           3 |    102.29
             |   48.5845
    ---------+----------
           4 |   86.1235
             |  42.72555
    ---------+----------
           5 |   61.8025
             |  15.16693
    ---------+----------
           6 |    55.411
             |  34.94722
    ---------+----------
           7 |   47.5955
             |  18.31569
    ---------+----------
           8 |   42.8915
             |  19.11019
    ---------+----------
           9 |    41.889
             |  14.88117
    ---------+----------
          10 |    3.0845
             |  1.718661
    ---------+----------
       Total |  145.9583
             |  216.8753
    --------------------
    asdoc (available on the SSC) can export the table to a Word file in a slightly better format.
    Code:
    ssc install asdoc
    asdoc tabstat invest mvalue kstock , statistics( mean sd ) by(company)  replace
    Click image for larger version

Name:	Capture.JPG
Views:	1
Size:	77.4 KB
ID:	1592486
    Last edited by Attaullah Shah; 02 Feb 2021, 03:06.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

    Comment


    • #3
      The columns() option allows transposing the table produced by the tabstat command (strictly, in Stata terms, not a function).
      Last edited by Nick Cox; 02 Feb 2021, 03:06.

      Comment


      • #4
        Thank you, Mr. Shah and Mr. Cox.
        tabstat is of course a command -not a function.

        When using the columns() option, it seems one can choose only between statistics or variables. However, in this case I would like to have the groups defined as the columns next to each other, and the variables as rows. Stata won't accept that: "option columns(groups) invalid -- specify columns(variables) or columns(statistics)".

        In your example, Mr. Shah, that would mean I would want the different companies as columns next to each other (for comparison). Perhaps one would need to use another command than tabstat?
        However, I will install the asdoc.

        Comment


        • #5
          This may help too. The point is that sometimes a bit of work to get a different data structure makes a table easier.

          Code:
          sysuse auto, clear
          
          foreach v in price mpg foreign { 
              local `v' : var label `v' 
              if "``v''" == "" local `v' "`v'"
          }
          stack price foreign mpg foreign, into(value foreign)
          
          label def _stack 1 "`price'" 2 "`mpg'"
          label val _stack _stack
          label var _stack " "
          
          label def foreign 0 Domestic 1 Foreign
          label val foreign foreign
          
          table _stack foreign, c(mean value sd value) format(%2.1f)
          
          ----------------------------------
                        |      foreign      
                        | Domestic   Foreign
          --------------+-------------------
                  Price |   6072.4    6384.7
                        |   3097.1    2621.9
                        | 
          Mileage (mpg) |     19.8      24.8
                        |      4.7       6.6
          ----------------------------------

          Comment


          • #6
            Wow. This seems to be what I need. I will have to look into this. I definitely need to learn some programming in Stata. Thank you!

            Comment


            • #7
              Here's a quick hack at a generalisation.

              Code:
              *! 1.0.0 NJC 4 Feb 2021 
              program variable_group_table 
                  version 8 
                  syntax varlist(numeric) [if] [in], by(varlist) STATs(string) ///
                  [TRANSpose *] 
                  
                  * data to use 
                  preserve 
                  if `'"`if'`in'"' != "" quietly keep `if' `in'
                  marksample touse 
                  markout `touse' `by', strok 
                  quietly count if `touse'
                  if r(N) == 0 error 2000  
                  
                  * copy group variable value labels if there are any 
                  local labelname : value label `by'
                  quietly if "`labelname'" != "" { 
                      levelsof `by', local(bylevels)
                      foreach x of local bylevels { 
                          local labels `labels' `x'  `"`: label `labelname' `x''"' 
                      }
                  }
              
                  * prepare call to -stack- while preserving variable labels or names 
                  local i = 0 
                  foreach v in `varlist' { 
                      local ++i 
                      local `v' : var label `v' 
                      if "``v''" == "" local `v' "`v'"
                      local label`i' "``v''" 
                      local call `call' `v' `by'
                  }
                  
                  * stack 
                  tempname value 
                  stack `call', into(`value' `by') clear 
                  
                  * now use variable labels (names) as new value labels 
                  local I = `i' 
                  forval i = 1/`I' { 
                      label def _stack `i'  "`label`i''", modify  
                  }
                  label val _stack _stack
                  
                  * blank out row and column variables 
                  label var _stack " "
                  label var `value' " "
                  
                  * copy group value labels if there were any 
                  if "`labelname'" != "" { 
                      label def `by' `labels'
                      label val `by' `by'
                  }
              
                  * prepare table call 
                  local call 
                  foreach s of local stats {
                      local call `call' `s' `value'
                  } 
              
                  * the table! 
                  if "`transpose'" != "" table `by' _stack, c(`call') `options' 
                  else table _stack `by', c(`call') `options' 
              end 
              
              
              . sysuse auto , clear
              (1978 Automobile Data)
              
              . variable_group_table mpg price, by(foreign) stat(mean sd)
              
              ----------------------------------
                            |      foreign      
                            | Domestic   Foreign
              --------------+-------------------
              Mileage (mpg) |  19.8269   24.7727
                            | 4.743297  6.611187
                            | 
                      Price |  6072.42   6384.68
                            | 3097.104  2621.915
              ----------------------------------
              
              . variable_group_table mpg price, by(rep78) stat(mean sd)
              
              ----------------------------------------------------------------
                            |                      rep78                      
                            |        1         2         3         4         5
              --------------+-------------------------------------------------
              Mileage (mpg) |       21    19.125   19.4333   21.6667   27.3636
                            |  4.24264  3.758324  4.141325   4.93487  8.732385
                            | 
                      Price |   4564.5   5967.63   6429.23    6071.5      5913
                            | 522.5519  3579.357   3525.14  1709.608  2615.763
              ----------------------------------------------------------------
              
              . variable_group_table mpg price, by(rep78) stat(p25 p50 p75)
              
              ------------------------------------------------------
                            |                 rep78                 
                            |      1       2       3       4       5
              --------------+---------------------------------------
              Mileage (mpg) |     18    16.5      17      18      18
                            |     21      18      19    22.5      30
                            |     24      23      21      25      35
                            | 
                      Price |   4195    4035    4187    4697    3984
                            | 4564.5    4638    4741  5751.5    5397
                            |   4934    6114    6295    7140    5899
              ------------------------------------------------------
              
              . variable_group_table mpg price, by(rep78) stat(p25 p50 p75) trans
              
              ----------------------------------------
                  rep78 | Mileage (mpg)          Price
              ----------+-----------------------------
                      1 |            18           4195
                        |            21         4564.5
                        |            24           4934
                        | 
                      2 |          16.5           4035
                        |            18           4638
                        |            23           6114
                        | 
                      3 |            17           4187
                        |            19           4741
                        |            21           6295
                        | 
                      4 |            18           4697
                        |          22.5         5751.5
                        |            25           7140
                        | 
                      5 |            18           3984
                        |            30           5397
                        |            35           5899
              ----------------------------------------

              Comment


              • #8
                Some small fixes

                Code:
                *! 1.0.1 NJC 4 Feb 2021 
                program variable_group_table 
                    version 8 
                    syntax varlist(numeric) [if] [in], by(varlist) STATs(string) ///
                    [TRANSpose *] 
                    
                    * data to use 
                    preserve 
                    if `"`if'`in'"' != "" quietly keep `if' `in'
                    marksample touse 
                    markout `touse' `by', strok 
                    quietly count if `touse'
                    if r(N) == 0 error 2000  
                    
                    * copy group variable value labels if there are any 
                    local labelname : value label `by'
                    quietly if "`labelname'" != "" { 
                        levelsof `by', local(bylevels)
                        foreach x of local bylevels { 
                            local labels `labels' `x'  `"`: label `labelname' `x''"' 
                        }
                    }
                
                    * prepare call to -stack- while preserving variable labels or names 
                    local i = 0 
                    foreach v in `varlist' { 
                        local ++i 
                        local `v' : var label `v' 
                        if "``v''" == "" local `v' "`v'"
                        local label`i' "``v''" 
                        local call `call' `v' `by'
                    }
                    local bylabel : var label `by' 
                    if `"`bylabel'"' == "" local bylabel "`by'" 
                    
                    * stack 
                    tempname value 
                    stack `call', into(`value' `by') clear 
                    
                    * now use variable labels (names) as new value labels 
                    local I = `i' 
                    forval i = 1/`I' { 
                        label def _stack `i'  "`label`i''", modify  
                    }
                    label val _stack _stack
                    
                    * blank out row and column variables 
                    label var _stack " "
                    label var `value' " "
                    
                    * copy group value labels if there were any 
                    if "`labelname'" != "" { 
                        label def `by' `labels'
                        label val `by' `by'
                    }
                
                    label var `by' `"`bylabel'"' 
                
                    * prepare table call 
                    local call 
                    foreach s of local stats {
                        local call `call' `s' `value'
                    } 
                
                    * the table! 
                    if "`transpose'" != "" table `by' _stack, c(`call') `options' 
                    else table _stack `by', c(`call') `options' 
                end

                Comment


                • #9
                  Thanks a lot for the quick hack :-)
                  With your additional explanations I am hoping to be able to figure it out for my own data.

                  Comment


                  • #10
                    Dear Nick,

                    This is exactly what I have been looking for. I have two questions:

                    1) Where one should usually put such a long code or quick fixes we find on Statalist? On the do-file where one works OR there is another way (I am thinking at installing a package, one writes one command and it is there, installed)?
                    2) I noticed that the code you provided does not allow putting the coefficients (mean, N) on separate columns but one below each other. Is there a way to do this?

                    Many, many thanks for providing with this quick hack!

                    Comment


                    • #11
                      1)

                      Code:
                      adopath
                      tell you where PLUS is. You need to put in the subdirectory or folder v off that. Or off of that, as some people say. In this case, there is no help file.

                      2)

                      The program is really just a wrapper for
                      table and I don't recall a hook for that. But output in columns is something you can get otherwise, e.g.

                      Code:
                      . sysuse auto, clear
                      (1978 Automobile Data)
                      
                      . collapse (sd) sdmpg = mpg sdprice = price (mean) meanmpg=mpg meanprice=price, by(rep78)
                      
                      
                      . tabdisp rep78, c(meanmpg sdmpg meanprice sdprice) format(%2.1f)
                      
                      ------------------------------------------------------------------
                      Repair    |
                      Record    |
                      1978      |   (mean) mpg      (sd) mpg  (mean) price    (sd) price
                      ----------+-------------------------------------------------------
                              1 |         21.0           4.2        4564.5         522.6
                              2 |         19.1           3.8        5967.6        3579.4
                              3 |         19.4           4.1        6429.2        3525.1
                              4 |         21.7           4.9        6071.5        1709.6
                              5 |         27.4           8.7        5913.0        2615.8
                              . |         21.4           5.1        6430.4        3804.3
                      ------------------------------------------------------------------
                      Code:
                      
                      


                      https://www.stata-journal.com/articl...article=pr0053 may be of interest.

                      Comment


                      • #12
                        Nominating Dr. Cox's hack for inclusion in Stata 17!

                        Comment


                        • #13
                          #11 collapse is destructive, but this isn't:


                          Code:
                          . sysuse auto, clear 
                          (1978 Automobile Data)
                          
                          . * ssc inst rangestat 
                          . rangestat (sd) sdmpg = mpg sdprice = price (mean) meanmpg=mpg meanprice=price, int(rep78 0 0) 
                          
                          . 
                          . tabdisp rep78 if rep78 < ., c(meanmpg sdmpg meanprice sdprice) format(%2.1f)
                          
                          ----------------------------------------------------------------------
                          Repair    |
                          Record    |
                          1978      |   mean of mpg      sd of mpg  mean of price    sd of price
                          ----------+-----------------------------------------------------------
                                  1 |          21.0            4.2         4564.5          522.6
                                  2 |          19.1            3.8         5967.6         3579.4
                                  3 |          19.4            4.1         6429.2         3525.1
                                  4 |          21.7            4.9         6071.5         1709.6
                                  5 |          27.4            8.7         5913.0         2615.8
                          ----------------------------------------------------------------------

                          Comment


                          • #14
                            Thank you, Dr. Cox, for sharing the generalized code. When I run the program and attempt to generate the table it results in the following error:

                            Click image for larger version

Name:	Picture1.png
Views:	1
Size:	14.5 KB
ID:	1715102


                            Any ideas on how to fix this?

                            Comment


                            • #15
                              That works for me in Stata 18. My initial fear was that this program had been broken by the new table in Stata 17 and 18. Not so: under version control it should still work.

                              Are you using a version of Stata before 18?

                              Code:
                              . variable_group_table mpg price, by(foreign) stat(mean sd)
                              
                              ----------------------------------
                                            |     Car origin    
                                            | Domestic   Foreign
                              --------------+-------------------
                              Mileage (mpg) |  19.8269   24.7727
                                            | 4.743297  6.611187
                                            | 
                                      Price |  6072.42   6384.68
                                            | 3097.104  2621.915
                              ----------------------------------
                              
                              
                              .

                              Comment

                              Working...
                              X