Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • dtable confidence interval

    Hi

    How can I add CI's to the dtable command for both continues and factor variables?

    Here is what I'm doing:

    dtable, svy ///
    factor(gender, stat(fvfrequency fvpercent)) /// I want to have a third column of CI for gender
    cont(age , stat(count mean) ) // and same for age

    collect levels result
    collect composite define col1 = count fvfrequency
    collect composite define col2 = mean fvpercent
    collect label levels result ///
    col1 "N_wt/count" ///
    col2 "Pcnt_wt/avg" ///
    , modify
    collect style autolevels result col1 col2 _dtable_test, clear
    collect style header result[col1 col2], title(hide) level(label)

  • #2
    dtable does not provide confidence intervals (CIs) among it's statistics. You will have to get the CIs from estimation commands into the collection created by dtable.

    Here is an example using the auto dataset. Given the use of the svy option in the original post, I svyset the data and use the svy: prefix for estimating the CIs.

    Code:
    sysuse auto
    svyset _n
    
    local fvars foreign rep78
    local cvars mpg turn trunk
    
    dtable, svy ///
        factor(`fvars', stat(fvfrequency fvpercent)) ///
        continuous(`cvars', stat(count mean))
    
    * collect CI's for factor variables
    foreach var of local fvars {
        svy: proportion `var', percent
        collect get _r_ci
    }
    * collect CI's for continuous variables
    foreach var of local cvars {
        svy: mean `var'
        collect get _r_ci
    }
    
    * custom composite results splitting results into 3 columns
    collect composite define col1 = fvfrequency count
    collect composite define col2 = fvpercent mean
    collect composite define col3 = _r_lb _r_ub, trim
    
    * adorn CIs with square brackets
    collect style cell result[col3], sformat("[%s]")
    
    * format CIs to be consisent with the percent formats
    collect query cell result[fvpercent]
    sreturn list
    collect style cell colname[`fvars']#result[_r_lb _r_ub], nformat(`s(nformat)')
    
    * custom result labels
    collect label levels result ///
        col1 "N_wt/count" ///
        col2 "Pct_wgt/avg" ///
        col3 "__LEVEL__% CI" ///
        , modify
    
    * reset autolevels for result to the composite results
    collect style autolevels result col1 col2 col3, clear
    
    * CI's do not show up because they are tagged with 'colname' instead of 'var'
    collect layout
    
    * change 'var' to 'colname'; -dtable- tags all results with both
    * dimensions as a convenience for such situations
    collect layout (colname) (result)
    Here is the resulting table.
    Code:
    ------------------------------------------------------------
                          N_wt/count Pct_wgt/avg     [95% CI]
    ------------------------------------------------------------
    Car origin
      Domestic                    52     (70.3%)     [58.7 79.7]
      Foreign                     22     (29.7%)     [20.3 41.3]
    Repair record 1978
      1                            2      (2.9%)      [0.7 11.2]
      2                            8     (11.6%)      [5.8 21.8]
      3                           30     (43.5%)     [32.1 55.6]
      4                           18     (26.1%)     [16.9 38.0]
      5                           11     (15.9%)      [8.9 26.9]
    Mileage (mpg)                 74      21.297 [19.957 22.638]
    Turn circle (ft.)             74      39.649 [38.629 40.668]
    Trunk space (cu. ft.)         74      13.757 [12.766 14.748]
    ------------------------------------------------------------

    Comment


    • #3
      That sounds great. Thanks, Jeff.

      I need one more thing. Is there a way that we can round those N_wt values? I looked up and people suggested that we generate another variable and do the round command to get them rounded off. However, since this is a svy, I don't think that'll work in this case. Thanks in advance.

      Comment


      • #4
        Also, how can I get the confidence interval be placed under the pct_wgt for each variable/category?

        Comment


        • #5
          I'm not clear on what you mean by "round" the N_wt values, the default numeric format for fvfrequency already rounds the values to the nearest whole number.

          The following example is modified from the one in #2. In it I add sampling weights with fractional values, introduce a new row dimension for stacking the CIs under the percent/mean values, define the CI composite result differently and include them in the composite result for column 2.

          I flag the changes with commented EDIT in blue.

          Code:
          sysuse auto
          svyset [pw=gear]        // <-- EDIT
          
          local fvars foreign rep78
          local cvars mpg turn trunk
          
          dtable, svy ///
              factor(`fvars', stat(fvfrequency fvpercent)) ///
              continuous(`cvars', stat(count mean))
          * add row dimension;
          * this will allow us to stack select results within a column
          collect addtags row[1]        // <-- EDIT
          
          * collect CI's for factor variables
          foreach var of local fvars {
              svy: proportion `var', percent
              collect get _r_ci, tags(row[2])        // <-- EDIT
          }
          * collect CI's for continuous variables
          foreach var of local cvars {
              svy: mean `var'
              collect get _r_ci, tags(row[2])        // <-- EDIT
          }
          
          * custom composite results splitting results into 3 columns
          collect composite define ci = _r_lb _r_ub, trim    // <-- EDIT
          collect composite define col1 = fvfrequency count
          collect composite define col2 = fvpercent mean ci    // <-- EDIT
          
          * adorn CIs with square brackets
          collect style cell result[ci], sformat("[%s]")    // <-- EDIT
          
          * format CIs to be consisent with the percent formats
          collect query cell result[fvpercent]
          sreturn list
          collect style cell colname[`fvars']#result[_r_lb _r_ub], nformat(`s(nformat)')
          
          * hide the row dimension
          collect style header row, title(hide) level(hide)
          
          * custom result labels;
          * assuming no label for ci given above header style
          collect label levels result ///
              col1 "N_wt/count" ///
              col2 "Pct_wgt/avg" ///
              , modify
          
          * reset autolevels for result to the composite results
          collect style autolevels result col1 col2, clear
          
          * change 'var' to 'colname'; -dtable- tags all results with both
          * dimensions as a convenience for such situations
          collect layout (colname#row) (result)        // <-- EDIT
          Here is the resulting table.
          Code:
          ------------------------------------------------
                                N_wt/count   Pct_wgt/avg
          ------------------------------------------------
          Car origin
            Domestic                   146         (65.4%)
                                               [53.1 76.0]
            Foreign                     77         (34.6%)
                                               [24.0 46.9]
          Repair record 1978
            1                            6          (2.8%)
                                                [0.7 10.9]
            2                           21         (10.4%)
                                                [5.1 19.9]
            3                           86         (41.6%)
                                               [30.3 53.9]
            4                           57         (27.6%)
                                               [17.9 40.0]
            5                           36         (17.6%)
                                                [9.9 29.3]
          Mileage (mpg)                 74          21.830
                                           [20.408 23.251]
          Turn circle (ft.)             74          39.204
                                           [38.196 40.213]
          Trunk space (cu. ft.)         74          13.432
                                           [12.461 14.403]
          ------------------------------------------------

          Comment


          • #6
            Thanks Jeff. That helped understanding the collect command better. regarding rounding values: I'd like to round down numbers like 11,932,961 to 11,930,000 (nearest 10,000). Unfortunately, transforming the variables directly isn't feasible due to survey weights. I was wondering if we can round down the values post svy option in this case.

            Comment


            • #7
              I was working on adding additional columns to the right side of this table, but I ran into some issues. I have a binary variable named "school_level" that indicates high school or middle school. I wanted to add two new columns (C1: N_wt and C2: pcnt_wt (95% CI)) to the right side of the table, corresponding to each level of the "school_level" variable. This would create a total of seven columns: var names, N_wt_total, pcnt_wt_total (95% CI), N_wt_middle, pcnt_wt_middle (95% CI), N_wt_high, and pcnt_wt_high (95% CI). I'm encountering two problems: 1) the columns for the totals are appearing at the end instead of after the variable names, and 2) the CIs in the total columns are missing their parentheses. Any ideas of how I can fix these?


              this is the code I'm using:



              Code:
              collect clear
              dtable, svy by(school_level) factor(gender, stat(fvfrequency fvpercent))
              
              by school_level: collect: proportion gender, percent
              collect composite define CI = _r_lb _r_ub, trim delimiter("-")
              collect style cell result[CI], sformat("(%s)") nformat(%2.1fc)
              collect style cell result[fvpercent], sformat("%s")
              
              collect composite define col1 = fvpercent CI
              collect composite define col2 = fvfrequency
              
              collect style autolevels result col1 col2, clear
              collect label levels result col1 "% (CI)" col2 "N" , modify
              collect style header result [col1 col2], title(hide) level(label)
              collect layout (colname) (school_level#result)
              collect preview

              this code will generate the following table.


              Code:
              . collect preview
              
              -----------------------------------------------------------------------------------------------
                       |                                    Education Level                                  
                       |          Middle school                    High school                   Total       
                       |             % (CI)           N              % (CI)           N   % (CI)            N
              ---------+-------------------------------------------------------------------------------------
                Gender |                                                                                     
              Female   |  48.71 (48.8-50.5)   5,765,721   49.11 (47.4-49.0)   7,507,333    48.94   13,273,054
                Male   |  51.29 (49.5-51.2)   6,070,043   50.89 (51.0-52.6)   7,778,480    51.06   13,848,523
              -----------------------------------------------------------------------------------------------

              Last edited by Sa Fe; 08 Mar 2024, 13:33.

              Comment

              Working...
              X