Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Esttab: Combining summary statistics and correlation table

    Hello Stata experts,

    I try to export my summary statistics, the correlation table and my regressions with the esttabs functions. So far, I have figured out how they work and my results look quite appealing. However, there are too minor things, I would like to do.
    1. I was asked to combine the summary statistics and the correlation matrix in one table.
    2. The correlations between the same variables should not be displayed as they are 1 anyways (e.g. Age correlated with Age = 1 and should not be displayed in the output).

    The table below should illustrate what I am trying to do. (Ideally the 1s are also not displayed)
    N Mean SE Var1 Var2 Var3 Var4
    Var1 1000 500 25 1
    Var2 1000 4 1 0.7 1
    Var3 1000 497 12 0.2 0.5 1
    Var4 1000 9345 124 0.08 0.4 0.5 1
    This is my do-file:

    Code:
    ** Set variables used in Summary and Correlation (NOTE: coeflabel has to be adjusted if changes are made)
    local variables [Variable Names]
    
    ** Descriptive statistics
    eststo clear
    estpost summarize `variables'
    esttab using Summary.rtf, ///
            replace ///
            cell((count(fmt(%9.0f) label(Count)) mean(fmt(%9.2f) label(Mean)) sd(fmt(%9.2f) label(Standard Deviation)) min(fmt(%9.2f) label(Min)) max(fmt(%9.2f) label(Max)))) ///
            coeflabel([Variable Names]​​​​​) ///
            nonumbers ///
            title("Summary Statistics")
    eststo clear
    
    ** Correlation matrix
    eststo clear
    estpost correlate `variables', matrix
    eststo corrtr
    esttab using Correlation.rtf, ///
            replace ///
            unstack ///
            not noobs ///
            compress ///
            coeflabel([Variable Names]​​​​​) ///
            nomtitles eqlabels("(1)" "(2)" "(3)" "(4)" "(5)" "(6)" "(7)" "(8)" "(9)" "(10)" "(11)" "(12)") ///
            b(%9.2f) ///
            star (* 0.10 ** 0.05 *** 0.01) ///
            nonumbers ///
            title ("Correlation Table")
    eststo clear

  • #2
    You may find the following example code, based on your but producing just the output shown in your illustrative table, may start you in a useful direction.
    Code:
    sysuse auto, clear
    matrix drop _all
    
    ** Set variables used in Summary and Correlation
    local variables length weight displacement price
    
    ** Descriptive statistics
    estpost summarize `variables'
    matrix table = ( e(count) \ e(mean) \ e(sd) )
    matrix rownames table = count mean sd
    matrix list table
    
    ** Correlation matrix
    correlate `variables'
    matrix C = r(C)
    local k = colsof(C)
    matrix C = C[1..`=`k'-1',.]
    local corr : rownames C
    matrix table = ( table \ C )
    matrix list table
    
    estadd matrix table = table
    
    local cells table[count](fmt(0) label(Count)) table[mean](fmt(2) label(Mean)) table[sd](fmt(2) label(Standard Deviation))
    local drop
    foreach row of local corr {
        local drop `drop' `row'
        local cells `cells' table[`row'](fmt(4) drop(`drop'))
    }
    display "`cells'"
    
    esttab using Report.rtf, ///
            replace ///
            noobs ///
            nonumbers ///
            compress ///
            cells("`cells'")
    Click image for larger version

Name:	Report.png
Views:	1
Size:	197.6 KB
ID:	1497532

    Last edited by William Lisowski; 09 May 2019, 09:32.

    Comment


    • #3
      Hi William,

      This is exactly what I am looking for! Thank you

      However, I have a minor adjustment to make and I was not able to figure out how it works myself. I would like to rename the variables.
      In the rows I want to have: "1 DescVar1" Whereas DescVar1 is not the variable name, but a description I specify. For example instead of "length", I'd prefer "1 Length".
      Columns should be labeled: "Count", "Mean", "Standard Deviation" and then "1", "2", etc in order to match the numbers in the rows.

      So the ideal table, which I should have drafted correctly in the first way, should like the following one:
      N Mean SE 1 2 3 4
      1 DescVar1 1000 500 25
      2 DescVar2 1000 4 1 0.7
      3 DescVar3 1000 497 12 0.2 0.5
      4 DescVar4 1000 9345 124 0.08 0.4 0.5

      Best regards,
      Tobi

      Comment


      • #4
        I will start with some advice to improve your future posts. As you saw, asking for something different than what you want risks getting something you cannot make as you want, and in the worst case, leads to a solution that cannot be made to do what you want, so the real answer requires starting over from the top. (That was not the case this time.) Take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.

        The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

        With that said, the following code seems to do what you want, although it could have problems because it is late in my day and I have other things to do now.
        Code:
        sysuse auto, clear
        matrix drop _all
        
        ** Set variables used in Summary and Correlation
        local variables length weight displacement price
        local labels `" "Length (in)" "Weight (lbs)" "Disp (ci)" "Price (USD)" "'
        
        ** Descriptive statistics
        estpost summarize `variables'
        matrix table = ( e(count) \ e(mean) \ e(sd) )
        matrix rownames table = count mean sd
        matrix list table
        
        ** Correlation matrix
        correlate `variables'
        matrix C = r(C)
        local corr : rownames C
        matrix table = ( table \ C )
        matrix list table
        
        estadd matrix table = table
        
        local cells table[count](fmt(0) label(Count)) table[mean](fmt(2) label(Mean)) table[sd](fmt(2) label(Standard Deviation))
        local collab
        local drop
        local i 0
        foreach row of local corr {
            local drop `drop' `row'
            local cells `cells' table[`row']( fmt(4) drop(`drop') label((`++i')) )
            local lbl : word `i' of `labels'
            local collab `" `collab' `row' "(`i') `lbl'" "'
        }
        display "`cells'"
        display `"`collab'"'
        
        esttab using Report.rtf, ///
                replace ///
                noobs ///
                nonumbers ///
                compress ///
                cells("`cells'") ///
                coeflabels(`collab')
        Click image for larger version

Name:	Report.png
Views:	1
Size:	218.8 KB
ID:	1497639

        Comment


        • #5
          Good morning William,

          Thank you for your reply! I'll try to improve my future posts and consider the tips in the FAQ.

          Regarding your code, it works perfectly fine without any issues. Thanks!

          Tobi

          Comment


          • #6
            I have a follow-up question. How would you modify the above code so that it includes significance stars for the correlation part of the table? For instance *** for significant at p < .001, ** for significant at p < .01, and * for p < .05. Also, how would you include the minimum and maximum in the table? Thanks.

            David
            Last edited by David Allsop; 07 Jan 2020, 12:22.

            Comment


            • #7
              Hi,

              In the solution, you used labels that you defined in the local labels. How to use labels of the variables if they are already assigned?

              Mouli

              Comment


              • #8
                Originally posted by David Allsop View Post
                I have a follow-up question. How would you modify the above code so that it includes significance stars for the correlation part of the table? For instance *** for significant at p < .001, ** for significant at p < .01, and * for p < .05. Also, how would you include the minimum and maximum in the table? Thanks.

                David
                William Lisowski Would also be interested to know this.

                Comment


                • #9
                  I would like to combine the descriptive statistics and the correlation matrix in one table. This should look like the following:

                  Mean | standard deviation | min | max | (1) | (2) .....

                  I have tried this with the following code, but without success. What is the correct code to get the desired output?

                  Thank you very much!

                  Also posted:
                  https://stackoverflow.com/questions/...atrix-in-stata

                  ​​​​​​​
                  Code:
                  ** Set variables used in Summary and Correlation
                  local variables tobins_q_w xrdat_w ivol_w firm_size_log_w industry_q_w growth_w lev_w os_w industry_growth_w hostility_w
                  local labels `" "Tobins q" "Innovativness" "Risk-taking" "Firm Size" "Industry q" "Growth" "Leverage" "Organizational Slack" "Environmental Munificience" "Environmental Hostility""'        
                          
                          ** Descriptive statistics
                  estpost sum tobins_q_w xrdat_w ivol_w firm_size_log_w industry_q_w growth_w lev_w os_w industry_growth_w hostility_w, detail
                  matrix table = (e(mean) \ e(sd) \ e(min) \ e(max))
                  matrix rownames table = mean sd min max
                  matrix list table
                  
                  ** Correlation matrix
                  pwcorr tobins_q_w xrdat_w ivol_w firm_size_log_w industry_q_w growth_w lev_w os_w industry_growth_w hostility_w, obs sig star(0.05)
                  matrix C = r(C)
                  local corr : rownames C
                  matrix table = ( table \ C )
                  matrix list table
                  
                  estadd matrix table = table
                  
                  local cells table[mean](fmt(2) label(Mean)) table[sd](fmt(2) label(Standard Deviation)) table[min](fmt(4) label(min)) table[max](fmt(4) label(max))
                  local collab
                  local drop
                  local i 0
                  foreach row of local corr {
                      local drop `drop' `row'
                      local cells `cells' table[`row']( fmt(4) drop(`drop') label((`++i')) )
                      local lbl : word `i' of `labels'
                      local collab `" `collab' `row' "(`i') `lbl'" "'
                  }
                  display "`cells'"
                  display `"`collab'"'
                  
                  esttab using Report.rtf, ///
                          replace ///
                          noobs ///
                          nonumbers ///
                          compress ///
                          cells("`cells'") ///
                          coeflabels(`collab')

                  Comment


                  • #10
                    Originally posted by William Lisowski View Post
                    I will start with some advice to improve your future posts. As you saw, asking for something different than what you want risks getting something you cannot make as you want, and in the worst case, leads to a solution that cannot be made to do what you want, so the real answer requires starting over from the top. (That was not the case this time.) Take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.

                    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

                    With that said, the following code seems to do what you want, although it could have problems because it is late in my day and I have other things to do now.
                    Code:
                    sysuse auto, clear
                    matrix drop _all
                    
                    ** Set variables used in Summary and Correlation
                    local variables length weight displacement price
                    local labels `" "Length (in)" "Weight (lbs)" "Disp (ci)" "Price (USD)" "'
                    
                    ** Descriptive statistics
                    estpost summarize `variables'
                    matrix table = ( e(count) \ e(mean) \ e(sd) )
                    matrix rownames table = count mean sd
                    matrix list table
                    
                    ** Correlation matrix
                    correlate `variables'
                    matrix C = r(C)
                    local corr : rownames C
                    matrix table = ( table \ C )
                    matrix list table
                    
                    estadd matrix table = table
                    
                    local cells table[count](fmt(0) label(Count)) table[mean](fmt(2) label(Mean)) table[sd](fmt(2) label(Standard Deviation))
                    local collab
                    local drop
                    local i 0
                    foreach row of local corr {
                    local drop `drop' `row'
                    local cells `cells' table[`row']( fmt(4) drop(`drop') label((`++i')) )
                    local lbl : word `i' of `labels'
                    local collab `" `collab' `row' "(`i') `lbl'" "'
                    }
                    display "`cells'"
                    display `"`collab'"'
                    
                    esttab using Report.rtf, ///
                    replace ///
                    noobs ///
                    nonumbers ///
                    compress ///
                    cells("`cells'") ///
                    coeflabels(`collab')
                    [ATTACH=CONFIG]n1497639[/ATTACH]
                    Hi William,

                    Thank you for your wonderful code. However, I would like to get the correlations between the same variables to show that IT IS 1. From the code, I could not figure it out which line is supposed to be modified to displace the correlation 1. Do you have any suggestions? Thank you very much.

                    Comment


                    • #11
                      Originally posted by Kevin Nanakdewa View Post

                      William Lisowski Would also be interested to know this.
                      Would appreciate any further suggestions here.

                      Comment

                      Working...
                      X