Table with multiple variables displayed horizontally and nested columns saved as a .txt file

Mariia Chavez

Join Date: Oct 2023

Posts: 28
#1

Table with multiple variables displayed horizontally and nested columns saved as a .txt file

27 Jun 2024, 07:02

Hello

I am trying to figure out how to make a table with one variable as the rows, and several variables nested as the columns.

Attached is an illustration of the table I am attempting to make. In the displayed table I filled out some of the cells manually just to illustrate, based on the information from more simple tables I was able to make in stata .

As I have previously received help from you for presenting multiple columns and saving it as a .txt file (previous thread: Merging tables horizontally(multiple columns) as .txt files in stata, using tabout. In order to copy-paste to excel. - Statalist), I have attempted to build on this code for the current conundrum as well.

Unlike last time, and as displayed in the picture attached , I now only have a need for one variable as my rows. In addition to this, I want my column variables presented as nested where the total, and the variables collgrad, married and race are presented repeatedly, for each of the industry values/categories.

This is my latest attempt:

ssc install estout, clear

webuse nlsw88, clear
estimates drop _all
local i 0
local vars "occupation"
foreach var1 of local vars{
foreach var2 of varlist collgrad married race{
local ++i
eststo m`i': estpost tab `var1' `var2'
}
local opt= cond("`var1'"==word("`vars'", 1), "replace nonumb", "append eqlab(none) nonumb collab(none)")
esttab m* using myfile3.txt, cells(colpct(fmt(2))) unstack drop(Total: Total) `opt'
estimates drop _all
}

However, I was not able to figure out how to achieve the nested columns where the variables collgrad, married and race are repeated for each category of industry. Nor was I able to add a column for the total.

Can you help me with this, please?

Like last time, I would also prefer that the output is placed in a .txt file, so that I can copy paste into an excel file

My actual dataset that I am trying to make the tables from is based on a survey that is weighted with the variable "weightcorr_n". In this dataset all "Refuse to answer" and "I don't know" were registered as missing data. That said, we want to include these in the table for the total to be accurate. With a simple tab function I know that I can fix this by adding "missing", for instance like so:

tab occupation race if industry==1 [aw = weightcorr_n], col missing nofreq

I am using nlsw88 as my example dataset as it was the only example dataset I could find with enough variables that also had subcategories in order to achieve the multiple and nested columns. However, due to it not being survey data, and without data that is registered as missing, I am not sure how to add these commands to the code for the example dataset in a meaningful way.
Hence, I would be very grateful if you have any pointers as to how I could adapt your suggested codes to a survey dataset, including the weighting command and also a way to add the missing data.

Best,
Maria

Attached Files
Tags: None

Andrew Musau

Join Date: Oct 2014
Posts: 9944

27 Jun 2024, 18:59

You can try something like:

Code:

webuse nlsw88, clear
estimates drop _all
local i 0
local vars "occupation collgrad south c_city"
foreach var1 of local vars{
    foreach var2 of varlist married union race{
        local ++i
        eststo Mining`i': estpost tab `var1' `var2' if industry==2
        eststo Construction`i': estpost tab `var1' `var2' if industry==3
        eststo Manufacturing`i': estpost tab `var1' `var2' if industry==4
    }
    local opt= cond("`var1'"==word("`vars'", 1), "replace nonumb", "append eqlab(none) nonumb collab(none)")
    esttab Mining* Construction* Manufacturing* using myfile.csv, cells(colpct(fmt(2))) unstack drop(Total: Total) `opt' mtitles(Mining1)
    estimates drop _all
}

The idea is to prefix each set of models with the name of the industry. One can loop, but this is easier to illustrate. For missing values, one way is to recode them to some nonmissing value not in the dataset, such as "9999". Then after creating the tables, recode them back to missing.

Code:

help recode

Res.:

Click image for larger version

Name: Capture.PNG
Views: 1
Size: 134.5 KB
ID: 1757362

Comment

Mariia Chavez

Join Date: Oct 2023

Posts: 28
#3

28 Jun 2024, 07:52

Thank you so much for this!

I made some slight adjustments to your command, so that the resulting table would be easier to compare to the illustrative table I had made manually:

webuse nlsw88, clear
estimates drop _all
local i 0
local vars "occupation"
foreach var1 of local vars{
foreach var2 of varlist collgrad married race{
local ++i
eststo Agriculture`i': estpost tab `var1' `var2' if industry==1
eststo Mining`i': estpost tab `var1' `var2' if industry==2
eststo Construction`i': estpost tab `var1' `var2' if industry==3
}
local opt= cond("`var1'"==word("`vars'", 1), "replace nonumb", "append eqlab(none) nonumb collab(none)")
esttab Agriculture* Mining* Construction* using myfile4.csv, cells(colpct(fmt(2))) unstack drop(Total: Total) `opt' mtitles(Agriculture1)
estimates drop _all
}

The resulting table is shown in the attached picture. I was able to get it to present the correct variables as columns and nested like I wanted (thank you for this!) but, the total N's are presented per industry-levels (industry==1, industry==2, etc), rather than per column (e.g., Agriculture x Not college, Agriculture x college, Agriculture x Single, Agriculture x Married, etc). Do you have any suggestions for how I can achieve this?
Also is there any way to get it to display the variable levels also for those that have no participants in them, such as the three occupation levels (Transport, Household, Other), and the missing race levels in Mining?

Best,
Maria
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 9944

29 Jun 2024, 04:47

You can place the cell counts in a separate column and do away with the observation counts.

Code:

webuse nlsw88, clear
estimates drop _all
local i 0
local vars "occupation"
foreach var1 of local vars{
foreach var2 of varlist collgrad married race{
local ++i
eststo Agriculture`i': estpost tab `var1' `var2' if industry==1
eststo Mining`i': estpost tab `var1' `var2' if industry==2
eststo Construction`i': estpost tab `var1' `var2' if industry==3
}
local opt= cond("`var1'"==word("`vars'", 1), "replace nonumb", "append eqlab(none) nonumb collab(none)")
esttab Agriculture* Mining* Construction* using myfile4.csv, cells("colpct(fmt(2)) b(label(N)  fmt(0))") unstack drop(Total: ) `opt' mtitles(Agriculture1) noobs order(Total)
estimates drop _all
}

Unfortunately, neither tabstat nor table has a method to handle absent categories. If you use a command such as fillin to add the absent categories, this will mess up the tabulations (i.e., these will no longer be based on actual counts as you have added observations). So perhaps add the absent categories by hand after creating the table.

Click image for larger version

Name: Capture.PNG
Views: 1
Size: 67.2 KB
ID: 1757422

Last edited by Andrew Musau; 29 Jun 2024, 04:52.

Announcement