Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How can I create tables that are frequently used for publications - concrete example

    Dear members,

    I am very new to using Stata and in particular creating tables, so I would like to ask you all if anyone can help me with this topic.

    Consider, for example, the following table:
    Click image for larger version

Name:	Bildschirmfoto 2023-01-18 um 12.18.41.png
Views:	1
Size:	56.9 KB
ID:	1697723



    For example the deal characteristic "Completed" is one possible outcome of the variable "Status", which is given in my dataset.

    How can I efficiently create such tables without having to create multiple tables and outputs and awkwardly merge them using Word?

    In particular, I would like to know how to show a proportion AND in parentheses the number of observations that meet the criteria? Also, how can I create some sort of subheadings like "Deal Characteristics" and "Target Characteristics" under the column heading "Characteristics"?

    I hope some of you could assist me with this issue! Thanks in advance!

    Best regards
    Bam
    Last edited by Bam Bovitzki; 18 Jan 2023, 04:33.

  • #2
    Can this perhaps be created more easily via the table generator? But I wonder how I can show only the necessary observations (e.g. Status = Completed) and not all of them.

    Comment


    • #3
      Bam:
      the -table- commands in Stata 17 are a great resource.
      As the final table can be created in -word- format, you can easily delete what is not necessary.
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #4
        Bam,

        As Carlo pointed out, the -table commands in Stata 17 will be helpful here. See in particular this post by Chuck Huber in the Stata Blog: Customizable tables in Stata 17, part 3: the classic table 1.
        Devra Golbe
        Professor Emerita, Dept. of Economics
        Hunter College, CUNY

        Comment


        • #5
          Thanks for the fast replies!

          I am a bit more into it now but still have the question how to get the different variable outcomes in one column without linking them together.
          Let's take "Status" (one observation is "Completed") and "Tender" (one observation is "Yes") for example: My problem is that Tender is always conditioned by status.. but I would like to get both only related to the total observations and NOT tender related to status (see picture below).

          Click image for larger version

Name:	Bildschirmfoto 2023-01-19 um 13.42.23.png
Views:	1
Size:	28.9 KB
ID:	1697857


          I hope some of you could assist me with this issue! Thanks in advance!

          Best regards
          Bam

          Comment


          • #6
            Bam,

            Your issue is not clear to me. Others may have the same problem. If you follow the advice in the FAQ, particularly #12, "What should I say about the commands and data I use?" you are much more likely to get a helpful reply.

            Best,
            Devra
            Devra Golbe
            Professor Emerita, Dept. of Economics
            Hunter College, CUNY

            Comment


            • #7
              Dear Devra and others,

              I am sorry that my issue is not clear for you. Let me try to explain it in more detail:

              I have a dataset with many M&A transactions (e.g. when one company buys another) and variables. These variables are, for example, "Status" with the possible outcomes "Completed", "Intended", "Pending", etc. So they describe the status of the M&A deal. Another variable is "Tender_offer" (i.e. an offer to buy some or all of the shares), the result of which can be "Yes" or "No".
              Now let me come to the problem or task I would like to solve:

              I would like to create the table I posted at the beginning. In the table, the two variables mentioned above are used. But here, in the first column titled "Characteristics", only for example the result "Completed" from the variable "Status" and the result "Yes" from the variable "Tender_offer" are shown. In the second column, you can see the individual proportion to ALL observations and the specific number in parentheses (e.g. of x observations, 1,335 were tagged with "Completed").
              How do I get such a table now? I have already tried for hours, but I fail to document the individual variables directly below each other. As you can see in my second table, "Status" is used as the main variable throughout and then the variable "Tender offer" is always applied proportionally.

              I hope my issue is now clearer and some of you could help me! Thanks in advance!

              Beste regards
              Bam


              Comment


              • #8
                Bam,

                Maybe you chose not to follow the FAQ advice because your data are proprietary and you can't post an extract. If that's the case, I suggest you make the smallest extract which demonstrates your problem, remove any identifying firm information and, if you still think that's not appropriate for posting, change the values and/or variable names in the data extract while retaining whatever structure you need to reproduce your issue. See the following small example.

                Code:
                sysuse nlsw88.dta
                dataex idcode age married wage in 1/5
                produces

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input int idcode byte(age married) float wage
                1 37 0 11.739125
                2 37 0  6.400963
                3 42 0  5.016723
                4 43 1 9.0338125
                6 42 1  8.083731
                end
                label values married marlbl
                label def marlbl 0 "Single", modify
                label def marlbl 1 "Married", modify
                For instance, if the variable idcode is necessary to understand your problem, but you can't reveal it, make up a new identifier.

                Devra
                Devra Golbe
                Professor Emerita, Dept. of Economics
                Hunter College, CUNY

                Comment


                • #9
                  Below you can see a small insight into the dataset with the 2 mentioned example variables:

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input str4 tsicp str14 status str3 tender
                  "5812" "Completed"      "No" 
                  "5065" "Withdrawn"      "No" 
                  "2421" "Completed"      "Yes"
                  "4813" "Completed"      "No" 
                  "4841" "Completed"      "No" 
                  "5075" "Completed"      "Yes"
                  "3571" "Completed"      "No" 
                  "7376" "Completed"      "No" 
                  "3829" "Completed"      "No" 
                  "3357" "Withdrawn"      "No" 
                  "3577" "Completed"      "No" 
                  "3812" "Completed"      "No" 
                  "6022" "Completed"      "No" 
                  "5531" "Status Unknown" "No" 
                  "8099" "Completed"      "No" 
                  "4813" "Withdrawn"      "No" 
                  "4813" "Completed"      "No" 
                  "7376" "Completed"      "Yes"
                  "3663" "Completed"      "No" 
                  "3613" "Withdrawn"      "No" 
                  end
                  The first column is a code for the deal, the second the "status" variable and the third the "tender offer" variable.

                  Best regards
                  Bam

                  Comment


                  • #10
                    Great. But you also need to show the code that produces your issue with the data extract you provide.
                    Devra Golbe
                    Professor Emerita, Dept. of Economics
                    Hunter College, CUNY

                    Comment


                    • #11
                      Well, actually I could not get that far because I do not know how to create a table as written at the beginning.

                      Code:
                      table (status tender) () (), statistic (percent) statistic (frequency)
                      collect style header tender[No], level(hide)
                      
                      collect levelsof result
                      collect label levels result percent "Proportion of sample", modify
                      collect label levels result frequency "(number)", modify
                      
                      collect style cell border_block, border(right, pattern(nil))
                      
                      collect preview
                      I think the main question is how to document the individual variables directly below each other. As you can see from my code, "Status" is used as the main variable and then the variable "Tender offer" is applied on the results of "Status".
                      Last edited by Bam Bovitzki; 19 Jan 2023, 13:39.

                      Comment


                      • #12
                        Try this. Notice how much more closely it hews to the code in Chuck Huber's blog post. (If you're wondering about the -encode- statements, try this without. That's what I did.)

                        Code:
                        clear
                        input str4 tsicp str14 status str3 tender
                        "5812" "Completed"      "No"
                        "5065" "Withdrawn"      "No"
                        "2421" "Completed"      "Yes"
                        "4813" "Completed"      "No"
                        "4841" "Completed"      "No"
                        "5075" "Completed"      "Yes"
                        "3571" "Completed"      "No"
                        "7376" "Completed"      "No"
                        "3829" "Completed"      "No"
                        "3357" "Withdrawn"      "No"
                        "3577" "Completed"      "No"
                        "3812" "Completed"      "No"
                        "6022" "Completed"      "No"
                        "5531" "Status Unknown" "No"
                        "8099" "Completed"      "No"
                        "4813" "Withdrawn"      "No"
                        "4813" "Completed"      "No"
                        "7376" "Completed"      "Yes"
                        "3663" "Completed"      "No"
                        "3613" "Withdrawn"      "No"
                        end
                        
                        encode status, gen(status2)
                        encode tender, gen(tender2)
                        
                        table, ///
                        statistic(fvfrequency status2 tender2) ///
                        statistic(fvpercent status2 tender2)
                        collect recode result fvfrequency = column1  ///
                                              fvpercent   = column2
                        collect layout (var)(result[column1 column2])  
                        collect preview
                        Devra Golbe
                        Professor Emerita, Dept. of Economics
                        Hunter College, CUNY

                        Comment


                        • #13
                          Dear Devra,

                          thank you for your solution provided! This helped me a lot and I managed to create most of the table - some additional questions so I can finish the table:
                          1. How can I add a heading like "Characteristics" to the first column 1 (under which you can see Completed, Tender offer, etc.)?
                          2. How can I add sub-headings such as "Deal characteristics" and "Target Characteristics"?
                          3. How can I add/delete spacing? When I use the command option spacer it "adds a space between levels created with different statistic() options". Hence, there is a spacing between Friendly attitude & Target Characteristics, which I would like to change.
                          4. In terms of format I have not researched a lot but maybe you have a quick solution for formatting "Proportion of sample" with only 4 digits and "(number" with brackets.
                          This is my current Code:

                          Code:
                          table, ///
                          statistic(fvfrequency status2 tender2 only_stock only_cash att_friendly) ///
                          statistic(fvproportion status2 tender2 only_stock only_cash att_friendly) ///
                          statistic (mean rankval_bn rel_size mv_bn tobins_q_trgt) ///
                          statistic (median rankval_bn rel_size mv_bn tobins_q_trgt)
                                    collect recode result fvproportion = column1 ///
                                   fvfrequency = column2 ///
                                   mean = column3 ///
                                   median = column4
                          
                          collect layout (var [1.status2 2.tender2 1.only_stock 1.only_cash rankval_bn rel_size mv_bn tobins_q_trgt])(result[column1 column2 column3 column4])
                          
                          collect label levels result column1 "Proportion of sample", modify
                          collect label levels result column2 "(number)", modify
                          collect label levels result column3 "Mean", modify
                          collect label levels result column4 "Median", modify
                          
                          collect style header, title(hide)
                          
                          collect label levels tender2 2 "Tender offer", modify
                          collect label levels only_stock 1 "Stock only", modify
                          collect label levels only_cash 1 "Cash only", modify
                          collect label levels only_cash 1 "Friendly attitude", modify
                          collect label levels var rankval_bn "Deal value (billions of US dollars)", modify
                          collect label levels var rel_size "Relative size (Target/Acquirer)", modify
                          collect label levels var mv_bn "Market value of equity (billions of US dollars)", modify
                          collect label levels var tobins_q_trgt "Q", modify
                          
                          collect title "Panel C: Deal and firm characteristics"
                          collect style title, font(, italic)
                          
                          collect style cell border_block, border(right, pattern(nil))
                          
                          collect style row stack, spacer
                          
                          collect preview
                          And the current table:
                          Click image for larger version

Name:	Bildschirmfoto 2023-01-20 um 15.32.15.png
Views:	1
Size:	45.5 KB
ID:	1698013

                          Thanks in advance!

                          Beste regards
                          Bam
                          Last edited by Bam Bovitzki; 20 Jan 2023, 07:37.

                          Comment


                          • #14
                            You're welcome.

                            Don't know the answers for questions 1-3. Re # 4: The blog post I've referenced several times talks about formatting. If you don't find the answer there, try consulting the other 6 blog posts in the series and/or the excellent Stata reference manual.

                            Don't forget to close this thread by reporting back on what worked for you.

                            Good luck!
                            Devra Golbe
                            Professor Emerita, Dept. of Economics
                            Hunter College, CUNY

                            Comment

                            Working...
                            X