Comparing means across groups using svy commands and esttab

Sofie Johansen

Join Date: Nov 2021
Posts: 4

11 Nov 2021, 13:33

Dear Stata users,

I hope someone can help me. I am having a really hard time making what I thought would be a fairly simple table.

I want to make a table of background factors related to placement breakdown, similar to the one below:

	Probability of placement breakdown	Significance
Gender
Boys (n=545)	0.21	0.720
Girls (n=478)	0.23	0.720
Age
0-5 (n=56)	0.07	0.000
6-12 (n=318)	0.22	0.958
13-17 (n=587)	0.23	0.320
18-22 (n=52)	0.23	0.741

Placement breakdown is a dummy variable called placementbreakdown (1 = placement breakdown, 0 = no placement breakdown).
Gender is also a dummy variable, called gender (1 = boy, 0 = girl).
Age consists of multiple dummy variables called age_0_5, age_6_12, age_13_17 and age_18_22.

If we start with the gender variable, I might be able to figure out the rest on my own.

My data are survey data collected in clusters, so we use the svy prefix to account for that.

What I have tried to do is:

. svy: mean breakdown, over(gender)
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =   1                            Number of obs   =      1,007
Number of PSUs   = 156                            Population size =      1,007
                                                  Design df       =        155

--------------------------------------------------------------------
                   |             Linearized
                   |       Mean   std. err.     [95% conf. interval]
-------------------+------------------------------------------------
c.breakdown@gender |
                0  |   .2252632    .024941       .175995    .2745313
                1  |   .2142857   .0237364      .1673971    .2611744
--------------------------------------------------------------------

. test c.breakdown@0.gender=c.breakdown@1.gender

Adjusted Wald test

 ( 1)  c.breakdown@0bn.gender - c.breakdown@1.gender = 0

F( 1, 155) = 0.13
Prob > F = 0.7200

. estadd scalar p_diff = r(p)

added scalar:
e(p_diff) = .71999581

. esttab ., cells("b") stats(p_diff) nostar noabbrev nonumber eqlabels(none) collabels(Probability of placement breakdown) nomtitle

-------------------------
Probability of placement breakdown
-------------------------
c.breakdown@0bn.gender .2252632
c.breakdown@1.gender .2142857
-------------------------
p_diff .7199958
-------------------------

I have tried several approaches, but this is the closest I get to the table above. It is not very interpretable, especially since there are no labels on the variables.

Are there any suggestions on how I could make this better or more interpretable?

Thank you in advance,

Sofie.

Andrew Musau

Join Date: Oct 2014

Posts: 10077
#2

11 Nov 2021, 15:50

See #2 here https://www.statalist.org/forums/for...tab-using-over
Sofie Johansen

Join Date: Nov 2021

Posts: 4
#3

15 Nov 2021, 09:10

Thank you, Andrew! That was very helpful. It is still not exactly the result I was looking for, though.

This is my preliminary result:

. svy: regress breakdown ibn.gender, nocons
(running regress on estimation sample)

Survey: Linear regression

Number of strata =   1                            Number of obs   =      1,007
Number of PSUs   = 156                            Population size =      1,007
                                                  Design df       =        155
                                                  F(2, 154)       =      66.84
                                                  Prob > F        =     0.0000
                                                  R-squared       =     0.2196

------------------------------------------------------------------------------
             |             Linearized
   breakdown | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      gender |
       Girl  |   .2252632    .024941     9.03   0.000      .175995    .2745313
        Boy  |   .2142857   .0237364     9.03   0.000     .1673971    .2611744
------------------------------------------------------------------------------

. mat list e(b)

e(b)[1,2]
            0.         1.
       gender     gender
y1  .22526316  .21428571

. test 0.gender=1.gender

Adjusted Wald test

( 1) 0bn.gender - 1.gender = 0

F( 1, 155) = 0.13
Prob > F = 0.7200

. estadd scalar p_diff = r(p)

added scalar:
e(p_diff) = .71999581

. esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber

---------------------------------
Probability of placement breakdown
---------------------------------
Girl .2252632
Boy .2142857
---------------------------------
p_diff .7199958
---------------------------------

Do you know if there is a way to get the variable label (in this example: "Gender of the child") automatically included in the table (in a row above the variable values or so)?

Something like this:

Probability of placement breakdown

Gender of the child

Girl .2252632

Boy .2142857

p_diff .7199958

In this example it is not really necessary (since the value labels are quite informative). But I have several variables that need some kind of explanation to be interpreted, e.g. yes/no questions. You need some information about the question to understand what "yes" and "no" stand for. I found code where you can type a label manually, but I am looking for a way to do this automatically, since I have to do it for more than 120 variables.
Andrew Musau

Join Date: Oct 2014

Posts: 10077
#4

15 Nov 2021, 09:13

Add -label- as an option in esttab. Otherwise show an example where the labels fail to appear. But note that your variables need to be labeled in the first place.
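
For readers unsure what being labeled "in the first place" involves, here is a minimal sketch. The toy data, the value-label name genderlbl, and the label texts are assumptions for illustration only, not taken from the actual dataset.

```stata
* Toy data for illustration only; variable and label names are hypothetical
clear
set obs 4
generate byte gender = mod(_n, 2)

* Variable label: describes what the variable measures
label variable gender "Gender of the child"

* Value labels: describe what each code means
label define genderlbl 0 "Girl" 1 "Boy"
label values gender genderlbl

* Check that both kinds of label are attached
describe gender
label list genderlbl
```

The distinction matters for the rest of the thread: in the esttab table in #3, the label option shows the value labels (Girl, Boy) as row names, while the variable label ("Gender of the child") is a separate piece of metadata.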
Sofie Johansen

Join Date: Nov 2021

Posts: 4
#5

17 Nov 2021, 01:46

Thank you, and sorry. I thought I had given an example in my last post, but maybe it is clearer with another example. When I use the command (including the label option), I get this table:

. esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber

---------------------------------
Probability of placement breakdown
---------------------------------
No .2252632
Yes .2142857
---------------------------------
p_diff .7199958
---------------------------------

What I would like is a table that also includes the variable label. Something like this (so it is clear what yes and no stand for):

Probability of placement breakdown

School age

No .2252632

Yes .2142857

p_diff .7199958

I know I can redefine the value labels and give them more informative names, but again, since I work with more than 120 variables, I would prefer an easier way if there is one.

Hope someone can help me.

Andrew Musau

Join Date: Oct 2014
Posts: 10077

17 Nov 2021, 04:42

Sorry, my mind was on value labels when I read your question in #3. By example, I mean a reproducible example in the sense of FAQ Advice #12. As long as the dependent variable in svy:regress is labeled, here is an automatic way (highlighted)

Code:

webuse nhanes2f
svyset psuid [pweight=finalwgt], strata(stratid)
svy: regress zinc ibn.race, nocons
esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `e(depvar)''"))

Res.:

Code:

. esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `e(depvar)''"))

---------------------------------
serum zinc (mcg/dL)          Mean
---------------------------------
White                       87.50
Black                       85.09
Other                       83.57
---------------------------------


Sofie Johansen

Join Date: Nov 2021

Posts: 4
#7

17 Nov 2021, 05:03

Thank you so much for your quick response. Now I get this:

. esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber mlab("Mean", lhs("`:var lab `e(depvar)''"))

---------------------------------
breakdown Mean
Probability of placement breakdown
---------------------------------
No .219888
Yes .21843
---------------------------------
p_diff .9683106
---------------------------------

But it is actually the label of the independent variable that I want included in the table (race, in your example).

I tried simply changing e(depvar) to e(indepvars), but that did not work:

esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber mlab("Mean", lhs("`:var lab `e(indepvars)''"))
nothing found where name expected

---------------------------------
Mean
Probability of placement breakdown
---------------------------------
No .219888
Yes .21843
---------------------------------
p_diff .9683106
---------------------------------

Any suggestions?

Andrew Musau

Join Date: Oct 2014
Posts: 10077

17 Nov 2021, 06:21

In the regression, there are two variables. It seems that you want the label of the categorical variable, e.g., gender in your case. From the stored results, the only way I see is to parse the command line to extract the variable name. From my example:

Code:

webuse nhanes2f
svyset psuid [pweight=finalwgt], strata(stratid)
svy: regress zinc ibn.race, nocons
esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `=ustrregexra("`e(command)'",  "(.*ibn\.)(.*)\,\s+\w+", "$2")''"))

Res.:

Code:

. esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `=ustrregexra("`e(command)'",  "(.*ibn\.)(.*)\,\s+\w+"
> , "$2")''"))

---------------------------------
1=white, 2=black, ~r         Mean
---------------------------------
White                       87.50
Black                       85.09
Other                       83.57
---------------------------------
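
When the categorical variable is already known, for example inside a loop over a list of variables, its label can also be read directly with the extended macro function `: variable label` instead of parsing e(command). A rough sketch along those lines, assuming the same nhanes2f example data, with race, sex, and region standing in purely as illustrative placeholders for the 120+ variables (esttab comes from the estout package on SSC):

```stata
* Sketch: one table per categorical variable, each headed by that
* variable's own label. The variable list is an illustrative assumption.
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)

foreach v of varlist race sex region {
    * Read the variable label directly rather than parsing e(command)
    local vlab : variable label `v'

    * Fall back to the variable name when no label is defined
    if `"`vlab'"' == "" local vlab "`v'"

    * Group means as coefficients of a no-constant regression
    svy: regress zinc ibn.`v', nocons

    * Put the categorical variable's label in the top-left header cell
    esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`vlab'"))
}
```

Because the loop variable supplies the name, this does not depend on how the command line was typed, which may be easier to maintain than a regular expression when the model specification changes. Labels that themselves contain double quotes would need extra care.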
