Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • dtable / table Confidence Intervals

    Is it possible to add confidence intervals to the summary statistics reported in dtable or table, or do I need an alternative program so I can export the results to Word?

    Thank you.
    Julie

  • #2
    Do you have both factor and continuous variables?
    I'll assume you do. In the following I will start with dtable, then use collect to add the CIs.
    Code:
    sysuse auto
    dtable mpg turn trunk i.for i.rep
    
    * get CIs for the continuous variables
    collect: mean mpg turn trunk
    
    * redefine composite result _r_ci to suite your style
    collect composite define _r_ci = _r_lb _r_ub, trim
    collect style cell result[_r_ci], sformat("[%s]")
    
    * add _r_ci to the result autolevels
    collect style autolevels result _r_ci
    
    * change 'var' to 'colname' in the layout
    collect layout (colname) (result)
    
    * add CIs for the factor variables
    collect: proportion for rep, percent
    
    * add custom string format for the factor variable percent CI limits
    collect style cell ///
            colname[i.foreign i.rep78]#result[_r_lb _r_ub], sformat("%s%%")
    
    collect preview
    Here is the final table.
    Code:
    . collect preview
    
    ------------------------------------------------------
                              Summary         [95% CI]    
    ------------------------------------------------------
    Mileage (mpg)         21.297 (5.786)   [19.957 22.638]
    Turn circle (ft.)     39.649 (4.399)   [38.629 40.668]
    Trunk space (cu. ft.) 13.757 (4.277)   [12.766 14.748]
    Car origin                                            
      Domestic                52 (70.3%) [57.557% 79.393%]
      Foreign                 22 (29.7%) [20.607% 42.443%]
    Repair record 1978                                    
      1                         2 (2.9%)  [0.708% 11.109%]
      2                        8 (11.6%)  [5.832% 21.736%]
      3                       30 (43.5%) [32.148% 55.533%]
      4                       18 (26.1%) [16.959% 37.886%]
      5                       11 (15.9%)  [8.958% 26.770%]
    ------------------------------------------------------

    Comment


    • #3
      Is there a way to include 95% confidence intervals when you are stratifying factor variables by another factor variable using dtable?

      Thanks,
      Karen

      Comment


      • #4

        Yes, but like my above example, you have to add the CIs.

        Here is my above example reworked so that foreign is used as the
        stratifying variable. The trick here is to use the by prefix
        with the collect prefix when computing the confidence intervals.
        Code:
        sysuse auto
        dtable mpg turn trunk i.rep, by(foreign, nototal)
        
        * get CIs for the continuous variables
        by foreign: collect: mean mpg turn trunk
        
        * redefine composite result _r_ci to suite your style
        collect composite define _r_ci = _r_lb _r_ub, trim
        collect style cell result[_r_ci], sformat("[%s]")
        
        * add _r_ci to the result autolevels
        collect style autolevels result _r_ci
        
        * change 'var' to 'colname' in the layout
        collect layout (colname) (foreign#result)
        
        * add CIs for the factor variables
        by foreign: collect: proportion for rep, percent
        
        * add custom string format for the factor variable percent CI limits
        collect style cell ///
                colname[i.rep78]#result[_r_lb _r_ub], sformat("%s%%")
        
        collect preview
        Here is the resulting table.
        Code:
        . collect preview
        
        ---------------------------------------------------------------------------------------
                                                          Car origin
                                          Domestic                          Foreign
        ---------------------------------------------------------------------------------------
        Mileage (mpg)         19.827 (4.743)   [18.506 21.147] 24.773 (6.611)   [21.841 27.704]
        Turn circle (ft.)     41.442 (3.968)   [40.338 42.547] 35.409 (1.501)   [34.744 36.075]
        Trunk space (cu. ft.) 14.750 (4.306)   [13.551 15.949] 11.409 (3.217)    [9.983 12.835]
        Repair record 1978
          1                         2 (4.2%)  [1.006% 15.678%]       0 (0.0%)
          2                        8 (16.7%)  [8.405% 30.358%]       0 (0.0%)
          3                       27 (56.2%) [41.726% 69.776%]      3 (14.3%)  [4.341% 37.967%]
          4                        9 (18.8%)  [9.883% 32.687%]      9 (42.9%) [23.014% 65.297%]
          5                         2 (4.2%)  [1.006% 15.678%]      9 (42.9%) [23.014% 65.297%]
        ---------------------------------------------------------------------------------------

        Comment


        • #5
          BTW, if your variables have missing values, you should compute and collect the confidence intervals separately for each variable. dtable does not assume listwise remove of observations with missing values by default, but mean and proportion do.

          Comment


          • #6
            Hi Jeff Pitblado (StataCorp)
            Is it possible to produce this with survey data using svy function and export to excel or word?

            Comment


            • #7
              Yes. dtable has an option for svy data, and collect works with the svy: prefix.

              In the following example, based on the one in #4, I highlight the changes in blue. The result is a table showing weighted means, weighted standard deviations, survey linearized CIs for the weighted means, weighted factor-variable frequencies, weighted factor-variable percentages, and survey linearized CIs for the weighted factor-variable percentages. Note that the CIs are not constructed using subpop(). My next post will show how to produce a similar table using subpopulation methods.
              Code:
              sysuse auto
              * add some randomly generated sampling weights
              set seed 18
              gen double sampw = runiformint(10,50) + runiform()
              svyset [pw=sampw]
              
              * add -svy- option
              dtable mpg turn trunk i.rep, by(foreign, nototal) svy
              
              * get CIs for the continuous variables
              * use -svy:- prefix to encorporate weights and linearized VCE
              by foreign: collect: svy: mean mpg turn trunk
              
              * redefine composite result _r_ci to suite your style
              collect composite define _r_ci = _r_lb _r_ub, trim
              collect style cell result[_r_ci], sformat("[%s]")
              
              * add _r_ci to the result autolevels
              collect style autolevels result _r_ci
              
              * change 'var' to 'colname' in the layout
              collect layout (colname) (foreign#result)
              
              * add CIs for the factor variables
              * use -svy:- prefix to encorporate weights and linearized VCE
              by foreign: collect: svy: proportion rep, percent
              
              * add custom string format for the factor variable percent CI limits
              collect style cell ///
                      colname[i.rep78]#result[_r_lb _r_ub], sformat("%s%%")
              
              collect preview
              
              collect export mytable.xlsx, replace
              Here is the resulting table.
              Code:
              ---------------------------------------------------------------------------------------
                                                                Car origin                           
                                                Domestic                          Foreign            
              ---------------------------------------------------------------------------------------
              Mileage (mpg)         20.133 (4.865)   [18.659 21.607] 24.503 (6.144)   [21.666 27.341]
              Turn circle (ft.)     41.242 (4.028)   [40.022 42.461] 35.491 (1.282)   [34.935 36.046]
              Trunk space (cu. ft.) 14.634 (4.486)   [13.227 16.041] 11.707 (3.321)   [10.112 13.303]
              Repair record 1978                                                                     
                1                        45 (3.1%)  [0.660% 13.518%]       0 (0.0%)                  
                2                      288 (20.1%)  [9.924% 36.443%]       0 (0.0%)                  
                3                      750 (52.3%) [36.680% 67.485%]     89 (13.8%)  [3.776% 39.589%]
                4                      292 (20.3%) [10.202% 36.431%]    279 (43.2%) [21.565% 67.864%]
                5                        60 (4.2%)  [0.960% 16.285%]    277 (42.9%) [21.409% 67.512%]
              ---------------------------------------------------------------------------------------
              Here is a screen shot of this table from the Numbers app on my Mac. In the Numbers app, I used the "Fit width to content" so we can see all the numbers.


              Click image for larger version

Name:	Screenshot 2024-12-04 at 11.36.47 AM.png
Views:	1
Size:	50.9 KB
ID:	1768829

              Comment


              • #8
                The following code reproduces a similar table as the previous example, but uses subpopulation estimation.
                Code:
                sysuse auto
                * add some randomly generated sampling weights
                set seed 18
                gen double sampw = runiformint(10,50) + runiform()
                svyset [pw=sampw]
                
                collect clear
                
                collect style use dtable, replace
                
                * local macro for continuous variables
                unab cvars : mpg turn trunk
                
                * local macros for factor variables
                fvunab fvars : i.rep78
                fvrevar `fvars' , list
                local fdims = r(varlist)
                
                * continuous variable statistics
                * collect CI limits
                collect : svy: mean `cvars', over(foreign)
                * collect mean and sd
                collect : estat sd
                
                * factor variable statistics
                collect percent=_r_b: svy: proportion `fvars', percent over(foreign)
                
                * custom format for SD estimates
                collect style cell result[sd], sformat("(%s)")
                
                * custom composite result for CIs
                collect composite define ci = _r_lb _r_ub, trim
                collect style cell result[ci], sformat("[%s]")
                
                * custom formats for the factor-variable percents and their CI limits
                collect style cell (`fdims')#result[percent _r_lb _r_ub], sformat("%s%%")
                
                * custom format for the factor-variable weighted frequencies
                collect style cell result[freq], nformat("%9.0fc")
                
                * customize header styles
                collect style header foreign `fdims', title(label) level(label)
                collect style header result, title(hide) level(hide)
                
                * organize the results we want to see in the table
                collect composite define stats = mean sd freq percent, trim
                collect style autolevels result stats ci, clear
                
                * arrange items into the table
                collect layout (colname_remainder[`cvars'] `fdims') (foreign#result)
                
                * export to Excel
                collect export mytable-subpop.xlsx, replace
                The resulting table is
                Code:
                ---------------------------------------------------------------------------------------
                                                                  Car origin                          
                                                  Domestic                          Foreign            
                ---------------------------------------------------------------------------------------
                Mileage (mpg)         20.133 (4.835)   [18.674 21.592] 24.503 (6.090)   [21.828 27.179]
                Turn circle (ft.)     41.242 (4.003)   [40.035 42.448] 35.491 (1.271)   [34.967 36.014]
                Trunk space (cu. ft.) 14.634 (4.459)   [13.241 16.027] 11.707 (3.292)   [10.204 13.211]
                Repair record 1978                                                                    
                  1                          45 3.1%  [0.672% 13.311%]         0 0.0%                  
                  2                        288 20.1% [10.008% 36.228%]         0 0.0%                  
                  3                        750 52.3% [36.847% 67.327%]       89 13.8%  [4.093% 37.601%]
                  4                        292 20.3% [10.286% 36.220%]      279 43.2% [22.611% 66.525%]
                  5                          60 4.2%  [0.976% 16.055%]      277 42.9% [22.445% 66.170%]
                ---------------------------------------------------------------------------------------
                Here is a screen shot of this table from the Numbers app on my Mac. In the Numbers app, I used the "Fit width to content" so we can see all the numbers.

                Click image for larger version

Name:	Screenshot 2024-12-04 at 5.20.38 PM.png
Views:	1
Size:	49.2 KB
ID:	1768831

                Last edited by Jeff Pitblado (StataCorp); 04 Dec 2024, 17:49.

                Comment

                Working...
                X