dtable confidence interval

Sa Fe

Join Date: Jun 2020

Posts: 93
#1

dtable confidence interval

06 Mar 2024, 10:18

Hi

How can I add CI's to the dtable command for both continues and factor variables?

Here is what I'm doing:

dtable, svy ///
factor(gender, stat(fvfrequency fvpercent)) /// I want to have a third column of CI for gender
cont(age , stat(count mean) ) // and same for age

collect levels result
collect composite define col1 = count fvfrequency
collect composite define col2 = mean fvpercent
collect label levels result ///
col1 "N_wt/count" ///
col2 "Pcnt_wt/avg" ///
, modify
collect style autolevels result col1 col2 _dtable_test, clear
collect style header result[col1 col2], title(hide) level(label)
Tags: ci, confidence interval, dtable, summary stats

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 658

06 Mar 2024, 19:54

dtable does not provide confidence intervals (CIs) among it's statistics. You will have to get the CIs from estimation commands into the collection created by dtable.

Here is an example using the auto dataset. Given the use of the svy option in the original post, I svyset the data and use the svy: prefix for estimating the CIs.

Code:

sysuse auto
svyset _n

local fvars foreign rep78
local cvars mpg turn trunk

dtable, svy ///
    factor(`fvars', stat(fvfrequency fvpercent)) ///
    continuous(`cvars', stat(count mean))

* collect CI's for factor variables
foreach var of local fvars {
    svy: proportion `var', percent
    collect get _r_ci
}
* collect CI's for continuous variables
foreach var of local cvars {
    svy: mean `var'
    collect get _r_ci
}

* custom composite results splitting results into 3 columns
collect composite define col1 = fvfrequency count
collect composite define col2 = fvpercent mean
collect composite define col3 = _r_lb _r_ub, trim

* adorn CIs with square brackets
collect style cell result[col3], sformat("[%s]")

* format CIs to be consisent with the percent formats
collect query cell result[fvpercent]
sreturn list
collect style cell colname[`fvars']#result[_r_lb _r_ub], nformat(`s(nformat)')

* custom result labels
collect label levels result ///
    col1 "N_wt/count" ///
    col2 "Pct_wgt/avg" ///
    col3 "__LEVEL__% CI" ///
    , modify

* reset autolevels for result to the composite results
collect style autolevels result col1 col2 col3, clear

* CI's do not show up because they are tagged with 'colname' instead of 'var'
collect layout

* change 'var' to 'colname'; -dtable- tags all results with both
* dimensions as a convenience for such situations
collect layout (colname) (result)

Here is the resulting table.

Code:

------------------------------------------------------------
                      N_wt/count Pct_wgt/avg     [95% CI]
------------------------------------------------------------
Car origin
  Domestic                    52     (70.3%)     [58.7 79.7]
  Foreign                     22     (29.7%)     [20.3 41.3]
Repair record 1978
  1                            2      (2.9%)      [0.7 11.2]
  2                            8     (11.6%)      [5.8 21.8]
  3                           30     (43.5%)     [32.1 55.6]
  4                           18     (26.1%)     [16.9 38.0]
  5                           11     (15.9%)      [8.9 26.9]
Mileage (mpg)                 74      21.297 [19.957 22.638]
Turn circle (ft.)             74      39.649 [38.629 40.668]
Trunk space (cu. ft.)         74      13.757 [12.766 14.748]
------------------------------------------------------------

Comment

Sa Fe

Join Date: Jun 2020

Posts: 93
#3

06 Mar 2024, 22:09

That sounds great. Thanks, Jeff.

I need one more thing. Is there a way that we can round those N_wt values? I looked up and people suggested that we generate another variable and do the round command to get them rounded off. However, since this is a svy, I don't think that'll work in this case. Thanks in advance.
Comment
Sa Fe

Join Date: Jun 2020

Posts: 93
#4

07 Mar 2024, 09:36

Also, how can I get the confidence interval be placed under the pct_wgt for each variable/category?
Comment

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 658

07 Mar 2024, 14:15

I'm not clear on what you mean by "round" the N_wt values, the default numeric format for fvfrequency already rounds the values to the nearest whole number.

The following example is modified from the one in #2. In it I add sampling weights with fractional values, introduce a new row dimension for stacking the CIs under the percent/mean values, define the CI composite result differently and include them in the composite result for column 2.

I flag the changes with commented EDIT in blue.

Code:

sysuse auto
svyset [pw=gear]        // <-- EDIT

local fvars foreign rep78
local cvars mpg turn trunk

dtable, svy ///
    factor(`fvars', stat(fvfrequency fvpercent)) ///
    continuous(`cvars', stat(count mean))
* add row dimension;
* this will allow us to stack select results within a column
collect addtags row[1]        // <-- EDIT

* collect CI's for factor variables
foreach var of local fvars {
    svy: proportion `var', percent
    collect get _r_ci, tags(row[2])        // <-- EDIT
}
* collect CI's for continuous variables
foreach var of local cvars {
    svy: mean `var'
    collect get _r_ci, tags(row[2])        // <-- EDIT
}

* custom composite results splitting results into 3 columns
collect composite define ci = _r_lb _r_ub, trim    // <-- EDIT
collect composite define col1 = fvfrequency count
collect composite define col2 = fvpercent mean ci    // <-- EDIT

* adorn CIs with square brackets
collect style cell result[ci], sformat("[%s]")    // <-- EDIT

* format CIs to be consisent with the percent formats
collect query cell result[fvpercent]
sreturn list
collect style cell colname[`fvars']#result[_r_lb _r_ub], nformat(`s(nformat)')

* hide the row dimension
collect style header row, title(hide) level(hide)

* custom result labels;
* assuming no label for ci given above header style
collect label levels result ///
    col1 "N_wt/count" ///
    col2 "Pct_wgt/avg" ///
    , modify

* reset autolevels for result to the composite results
collect style autolevels result col1 col2, clear

* change 'var' to 'colname'; -dtable- tags all results with both
* dimensions as a convenience for such situations
collect layout (colname#row) (result)        // <-- EDIT

Here is the resulting table.

Code:

------------------------------------------------
                      N_wt/count   Pct_wgt/avg
------------------------------------------------
Car origin
  Domestic                   146         (65.4%)
                                     [53.1 76.0]
  Foreign                     77         (34.6%)
                                     [24.0 46.9]
Repair record 1978
  1                            6          (2.8%)
                                      [0.7 10.9]
  2                           21         (10.4%)
                                      [5.1 19.9]
  3                           86         (41.6%)
                                     [30.3 53.9]
  4                           57         (27.6%)
                                     [17.9 40.0]
  5                           36         (17.6%)
                                      [9.9 29.3]
Mileage (mpg)                 74          21.830
                                 [20.408 23.251]
Turn circle (ft.)             74          39.204
                                 [38.196 40.213]
Trunk space (cu. ft.)         74          13.432
                                 [12.461 14.403]
------------------------------------------------

Comment

Sa Fe

Join Date: Jun 2020

Posts: 93
#6

08 Mar 2024, 10:18

Thanks Jeff. That helped understanding the collect command better. regarding rounding values: I'd like to round down numbers like 11,932,961 to 11,930,000 (nearest 10,000). Unfortunately, transforming the variables directly isn't feasible due to survey weights. I was wondering if we can round down the values post svy option in this case.
Comment

Sa Fe

Join Date: Jun 2020
Posts: 93

08 Mar 2024, 12:36

I was working on adding additional columns to the right side of this table, but I ran into some issues. I have a binary variable named "school_level" that indicates high school or middle school. I wanted to add two new columns (C1: N_wt and C2: pcnt_wt (95% CI)) to the right side of the table, corresponding to each level of the "school_level" variable. This would create a total of seven columns: var names, N_wt_total, pcnt_wt_total (95% CI), N_wt_middle, pcnt_wt_middle (95% CI), N_wt_high, and pcnt_wt_high (95% CI). I'm encountering two problems: 1) the columns for the totals are appearing at the end instead of after the variable names, and 2) the CIs in the total columns are missing their parentheses. Any ideas of how I can fix these?

this is the code I'm using:

Code:

collect clear
dtable, svy by(school_level) factor(gender, stat(fvfrequency fvpercent))

by school_level: collect: proportion gender, percent
collect composite define CI = _r_lb _r_ub, trim delimiter("-")
collect style cell result[CI], sformat("(%s)") nformat(%2.1fc)
collect style cell result[fvpercent], sformat("%s")

collect composite define col1 = fvpercent CI
collect composite define col2 = fvfrequency

collect style autolevels result col1 col2, clear
collect label levels result col1 "% (CI)" col2 "N" , modify
collect style header result [col1 col2], title(hide) level(label)
collect layout (colname) (school_level#result)
collect preview

this code will generate the following table.

Code:

. collect preview

-----------------------------------------------------------------------------------------------
         |                                    Education Level                                  
         |          Middle school                    High school                   Total       
         |             % (CI)           N              % (CI)           N   % (CI)            N
---------+-------------------------------------------------------------------------------------
  Gender |                                                                                     
Female   |  48.71 (48.8-50.5)   5,765,721   49.11 (47.4-49.0)   7,507,333    48.94   13,273,054
  Male   |  51.29 (49.5-51.2)   6,070,043   50.89 (51.0-52.6)   7,778,480    51.06   13,848,523
-----------------------------------------------------------------------------------------------

Last edited by Sa Fe; 08 Mar 2024, 13:33.

Announcement