How can I create tables that are frequently used for publications - concrete example

Bam Bovitzki

Join Date: Jan 2023

Posts: 13
#1

How can I create tables that are frequently used for publications - concrete example

18 Jan 2023, 04:31

Dear members,

I am very new to using Stata and in particular creating tables, so I would like to ask you all if anyone can help me with this topic.

Consider, for example, the following table:

For example the deal characteristic "Completed" is one possible outcome of the variable "Status", which is given in my dataset.

How can I efficiently create such tables without having to create multiple tables and outputs and awkwardly merge them using Word?

In particular, I would like to know how to show a proportion AND in parentheses the number of observations that meet the criteria? Also, how can I create some sort of subheadings like "Deal Characteristics" and "Target Characteristics" under the column heading "Characteristics"?

I hope some of you could assist me with this issue! Thanks in advance!

Best regards
Bam

Last edited by Bam Bovitzki; 18 Jan 2023, 04:33.
Tags: tab, tables
Bam Bovitzki

Join Date: Jan 2023

Posts: 13
#2

18 Jan 2023, 10:54

Can this perhaps be created more easily via the table generator? But I wonder how I can show only the necessary observations (e.g. Status = Completed) and not all of them.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#3

18 Jan 2023, 11:12

Bam:
the -table- commands in Stata 17 are a great resource.
As the final table can be created in -word- format, you can easily delete what is not necessary.

Kind regards,
Carlo
(Stata 19.0)
1 like
Comment
Devra Golbe

Join Date: Apr 2014

Posts: 170
#4

18 Jan 2023, 19:52

Bam,

As Carlo pointed out, the -table commands in Stata 17 will be helpful here. See in particular this post by Chuck Huber in the Stata Blog: Customizable tables in Stata 17, part 3: the classic table 1.

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY
1 like
Comment
Bam Bovitzki

Join Date: Jan 2023

Posts: 13
#5

19 Jan 2023, 05:45

Thanks for the fast replies!

I am a bit more into it now but still have the question how to get the different variable outcomes in one column without linking them together.
Let's take "Status" (one observation is "Completed") and "Tender" (one observation is "Yes") for example: My problem is that Tender is always conditioned by status.. but I would like to get both only related to the total observations and NOT tender related to status (see picture below).

I hope some of you could assist me with this issue! Thanks in advance!

Best regards
Bam
Comment
Devra Golbe

Join Date: Apr 2014

Posts: 170
#6

19 Jan 2023, 10:19

Bam,

Your issue is not clear to me. Others may have the same problem. If you follow the advice in the FAQ, particularly #12, "What should I say about the commands and data I use?" you are much more likely to get a helpful reply.

Best,
Devra

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY
1 like
Comment
Bam Bovitzki

Join Date: Jan 2023

Posts: 13
#7

19 Jan 2023, 11:33

Dear Devra and others,

I am sorry that my issue is not clear for you. Let me try to explain it in more detail:

I have a dataset with many M&A transactions (e.g. when one company buys another) and variables. These variables are, for example, "Status" with the possible outcomes "Completed", "Intended", "Pending", etc. So they describe the status of the M&A deal. Another variable is "Tender_offer" (i.e. an offer to buy some or all of the shares), the result of which can be "Yes" or "No".
Now let me come to the problem or task I would like to solve:

I would like to create the table I posted at the beginning. In the table, the two variables mentioned above are used. But here, in the first column titled "Characteristics", only for example the result "Completed" from the variable "Status" and the result "Yes" from the variable "Tender_offer" are shown. In the second column, you can see the individual proportion to ALL observations and the specific number in parentheses (e.g. of x observations, 1,335 were tagged with "Completed").
How do I get such a table now? I have already tried for hours, but I fail to document the individual variables directly below each other. As you can see in my second table, "Status" is used as the main variable throughout and then the variable "Tender offer" is always applied proportionally.

I hope my issue is now clearer and some of you could help me! Thanks in advance!

Beste regards
Bam
Comment
Devra Golbe

Join Date: Apr 2014

Posts: 170
#8

19 Jan 2023, 12:13

Bam,

Maybe you chose not to follow the FAQ advice because your data are proprietary and you can't post an extract. If that's the case, I suggest you make the smallest extract which demonstrates your problem, remove any identifying firm information and, if you still think that's not appropriate for posting, change the values and/or variable names in the data extract while retaining whatever structure you need to reproduce your issue. See the following small example.

Code:

sysuse nlsw88.dta dataex idcode age married wage in 1/5

produces

Code:

* Example generated by -dataex-. For more info, type help dataex clear input int idcode byte(age married) float wage 1 37 0 11.739125 2 37 0 6.400963 3 42 0 5.016723 4 43 1 9.0338125 6 42 1 8.083731 end label values married marlbl label def marlbl 0 "Single", modify label def marlbl 1 "Married", modify

For instance, if the variable idcode is necessary to understand your problem, but you can't reveal it, make up a new identifier.

Devra

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY
Comment

Bam Bovitzki

Join Date: Jan 2023
Posts: 13

19 Jan 2023, 12:32

Below you can see a small insight into the dataset with the 2 mentioned example variables:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str4 tsicp str14 status str3 tender
"5812" "Completed"      "No" 
"5065" "Withdrawn"      "No" 
"2421" "Completed"      "Yes"
"4813" "Completed"      "No" 
"4841" "Completed"      "No" 
"5075" "Completed"      "Yes"
"3571" "Completed"      "No" 
"7376" "Completed"      "No" 
"3829" "Completed"      "No" 
"3357" "Withdrawn"      "No" 
"3577" "Completed"      "No" 
"3812" "Completed"      "No" 
"6022" "Completed"      "No" 
"5531" "Status Unknown" "No" 
"8099" "Completed"      "No" 
"4813" "Withdrawn"      "No" 
"4813" "Completed"      "No" 
"7376" "Completed"      "Yes"
"3663" "Completed"      "No" 
"3613" "Withdrawn"      "No" 
end

The first column is a code for the deal, the second the "status" variable and the third the "tender offer" variable.

Best regards
Bam

Comment

Devra Golbe

Join Date: Apr 2014

Posts: 170
#10

19 Jan 2023, 12:53

Great. But you also need to show the code that produces your issue with the data extract you provide.

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY
Comment
Bam Bovitzki

Join Date: Jan 2023

Posts: 13
#11

19 Jan 2023, 13:34

Well, actually I could not get that far because I do not know how to create a table as written at the beginning.

Code:

table (status tender) () (), statistic (percent) statistic (frequency) collect style header tender[No], level(hide) collect levelsof result collect label levels result percent "Proportion of sample", modify collect label levels result frequency "(number)", modify collect style cell border_block, border(right, pattern(nil)) collect preview

I think the main question is how to document the individual variables directly below each other. As you can see from my code, "Status" is used as the main variable and then the variable "Tender offer" is applied on the results of "Status".

Last edited by Bam Bovitzki; 19 Jan 2023, 13:39.
Comment

Devra Golbe

Join Date: Apr 2014
Posts: 170

#12

19 Jan 2023, 14:06

Try this. Notice how much more closely it hews to the code in Chuck Huber's blog post. (If you're wondering about the -encode- statements, try this without. That's what I did.)

Code:

clear
input str4 tsicp str14 status str3 tender
"5812" "Completed"      "No"
"5065" "Withdrawn"      "No"
"2421" "Completed"      "Yes"
"4813" "Completed"      "No"
"4841" "Completed"      "No"
"5075" "Completed"      "Yes"
"3571" "Completed"      "No"
"7376" "Completed"      "No"
"3829" "Completed"      "No"
"3357" "Withdrawn"      "No"
"3577" "Completed"      "No"
"3812" "Completed"      "No"
"6022" "Completed"      "No"
"5531" "Status Unknown" "No"
"8099" "Completed"      "No"
"4813" "Withdrawn"      "No"
"4813" "Completed"      "No"
"7376" "Completed"      "Yes"
"3663" "Completed"      "No"
"3613" "Withdrawn"      "No"
end

encode status, gen(status2)
encode tender, gen(tender2)

table, ///
statistic(fvfrequency status2 tender2) ///
statistic(fvpercent status2 tender2)
collect recode result fvfrequency = column1  ///
                      fvpercent   = column2
collect layout (var)(result[column1 column2])  
collect preview

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY

Comment

Bam Bovitzki

Join Date: Jan 2023
Posts: 13

#13

20 Jan 2023, 07:34

Dear Devra,

thank you for your solution provided! This helped me a lot and I managed to create most of the table - some additional questions so I can finish the table:

How can I add a heading like "Characteristics" to the first column 1 (under which you can see Completed, Tender offer, etc.)?
How can I add sub-headings such as "Deal characteristics" and "Target Characteristics"?
How can I add/delete spacing? When I use the command option spacer it "adds a space between levels created with different statistic() options". Hence, there is a spacing between Friendly attitude & Target Characteristics, which I would like to change.
In terms of format I have not researched a lot but maybe you have a quick solution for formatting "Proportion of sample" with only 4 digits and "(number" with brackets.

This is my current Code:

Code:

table, ///
statistic(fvfrequency status2 tender2 only_stock only_cash att_friendly) ///
statistic(fvproportion status2 tender2 only_stock only_cash att_friendly) ///
statistic (mean rankval_bn rel_size mv_bn tobins_q_trgt) ///
statistic (median rankval_bn rel_size mv_bn tobins_q_trgt)
          collect recode result fvproportion = column1 ///
         fvfrequency = column2 ///
         mean = column3 ///
         median = column4

collect layout (var [1.status2 2.tender2 1.only_stock 1.only_cash rankval_bn rel_size mv_bn tobins_q_trgt])(result[column1 column2 column3 column4])

collect label levels result column1 "Proportion of sample", modify
collect label levels result column2 "(number)", modify
collect label levels result column3 "Mean", modify
collect label levels result column4 "Median", modify

collect style header, title(hide)

collect label levels tender2 2 "Tender offer", modify
collect label levels only_stock 1 "Stock only", modify
collect label levels only_cash 1 "Cash only", modify
collect label levels only_cash 1 "Friendly attitude", modify
collect label levels var rankval_bn "Deal value (billions of US dollars)", modify
collect label levels var rel_size "Relative size (Target/Acquirer)", modify
collect label levels var mv_bn "Market value of equity (billions of US dollars)", modify
collect label levels var tobins_q_trgt "Q", modify

collect title "Panel C: Deal and firm characteristics"
collect style title, font(, italic)

collect style cell border_block, border(right, pattern(nil))

collect style row stack, spacer

collect preview

And the current table:

Thanks in advance!

Beste regards
Bam

Last edited by Bam Bovitzki; 20 Jan 2023, 07:37.

Comment

Devra Golbe

Join Date: Apr 2014

Posts: 170
#14

20 Jan 2023, 09:13

You're welcome.

Don't know the answers for questions 1-3. Re # 4: The blog post I've referenced several times talks about formatting. If you don't find the answer there, try consulting the other 6 blog posts in the series and/or the excellent Stata reference manual.

Don't forget to close this thread by reporting back on what worked for you.

Good luck!

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY
1 like
Comment

Announcement