  • Customizing a Diff-in-Diff table using Collect: A question about tagging confidence intervals

    I have created a custom table to report multiple models that I am estimating with Juan Villa's excellent diff command (ssc install diff), though my question should not depend on this user-written command. I include it here to give a working example that shows clearly what I am trying to do and what is not working.

    I am using Stata's new collect commands to produce a customized table of diff-in-diff results. I want the table to show the pre means with standard errors (SEs), the post means with SEs, and finally the diff-in-diff estimates with confidence intervals (CIs). Below is a working example that almost produces the desired table.

    Code:
    use "http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta", clear
    collect clear
    
    * Estimate and collect regression results
    collect create reg1, replace
    collect, name(reg1): diff fte, t(treated) p(t) cov(bk kfc roys)
    
    collect create reg2, replace
    collect, name(reg2): diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)
    
    * Combine regression output and create the table
    collect combine regs = reg1 reg2
    
    collect addtags treat[0], fortags(result[mean_c0 se_c0 mean_c1 se_c1])
    collect addtags treat[1], fortags(result[mean_t0 se_t0 mean_t1 se_t1])
    collect addtags treat[2], fortags(result[diff0 se_d0 diff1 se_d1])
    collect label levels treat 0 "Control" 1 "Treated" 2 "Diff (T-C)", modify
    
    collect addtags time[1], fortags(result[mean_c0 mean_t0 se_c0 se_t0 diff0 se_d0])
    collect addtags time[2], fortags(result[mean_c1 mean_t1 se_c1 se_t1 diff1 se_d1])
    collect addtags time[3], fortags(colname[_diff]#result[_r_b])
    collect addtags time[3], fortags(result[_r_ci])
    collect label levels time 1 "Pre" 2 "Post" 3 "Pre-Post"
    
    collect addtags values[1], fortags(result[mean_c0 mean_t0 mean_c1 mean_t1 diff0 diff1] colname[_diff]#result[_r_b])
    collect addtags values[2], fortags(result[se_c0 se_t0 se_c1 se_t1 se_d0 se_d1] colname[_diff]#result[_r_ci])
    
    collect addtags treat[3], fortags(time[3]#colname[_diff]#result[_r_b])
    collect addtags treat[3], fortags(result[_r_ci])
    collect label levels treat 3 "Diff-in-Diff", modify
    
    collect label levels collection reg1 "Reg1" reg2 "Reg2"
    collect style cell values[1], nformat(%3.2f) halign(center)
    collect style cell values[2], nformat(%3.2f) sformat("(%s)") halign(center)
    collect style cell result[_r_ci], cidelimiter(,) nformat(%3.2f) sformat("(%s)") halign(center)
    collect style header values, level(hide)
    collect style column, dups(center) width(equal)
    collect style cell border_block, border(right, pattern(nil))
    collect style cell cell_type[column-header]#time[1] cell_type[column-header]#time[2], border(bottom, pattern(single))
    collect style cell cell_type[column-header]#time[3] cell_type[column-header]#treat, halign(center)
    
    collect layout (collection#values) (time#treat)
    Below is the table produced by the working example code. The table looks as desired, but CIs for the diff-in-diff estimates are not displayed:

    HTML Code:
    Collection: regs
          Rows: collection#values
       Columns: time#treat
       Table 1: 4 x 7
    
    -----------------------------------------------------------------------------------------------
                           Pre                                   Post                    Pre-Post  
         -----------------------------------------------------------------------------            
            Control      Treated    Diff (T-C)     Control      Treated    Diff (T-C)  Diff-in-Diff
    -----------------------------------------------------------------------------------------------
    Reg1     21.16        18.84        -2.32        18.76        19.37        0.61         2.94    
            (1.14)       (0.85)       (1.03)       (1.16)       (0.85)       (1.04)                
    Reg2     20.04        17.07        -2.98        17.45        17.57        0.12         3.10    
            (0.67)       (0.67)       (0.94)       (0.67)       (0.67)       (0.95)                
    -----------------------------------------------------------------------------------------------
    The line (and its sibling for time[3] earlier in the code)

    Code:
    collect addtags treat[3], fortags(result[_r_ci])
    produces the message
    HTML Code:
    (0 items changed in collection regs)
    So it appears the problem is that these items are not being assigned the desired tags. The documentation of collect addtags does not explicitly state that level _r_ci cannot be tagged, but perhaps this restriction is undocumented? I noticed that levels _r_ci and _r_cri cannot be recoded using collect recode, but that restriction is explicitly documented. I would usually just accept defeat and display SEs for the diff-in-diff estimate, but the principal investigator on this project specifically wants to report CIs for the diff-in-diff estimates (CIs are more accepted in the relevant literature). I appreciate any help or advice toward code that displays the CIs as desired.
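
    One way to narrow this down is to list the dimensions and levels the collection actually contains. The following is a minimal, read-only sketch, assuming the example code above has been run so that regs is the current collection (the table header "Collection: regs" suggests it is). It uses only collect dims and collect levelsof and does not change the collection:

    ```stata
    * Diagnostic sketch (read-only): inspect the tags in the current collection.
    * Assumes the example code above has been run, so that regs is current.
    collect dims                  // list every dimension in the collection
    collect levelsof result       // list the levels of result (look for _r_ci here)
    collect levelsof time         // list the levels of the custom time dimension
    collect levelsof treat        // list the levels of the custom treat dimension
    ```

    These listings show which levels exist, not which items carry them, so they can only narrow the problem down rather than confirm where the CIs ended up.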
    Last edited by Andrew Padovani; 10 Jan 2022, 16:14.

  • #2
    I am updating this thread with what I learned trying to find a solution on my own. I was unable to add CIs in the same row as the SEs, but with some modification I was able to produce a table with both stars on the diff-in-diff estimates and 95% CIs in a row of their own underneath the diff-in-diff estimates (which, it turns out, the previous code could not do!).

    I figured out that using collect addtags to organize the result dimension into both rows and columns causes problems; it more or less works, but it breaks other functionality such as collect stars. In my application, collect addtags should be used to organize the levels of result into columns, and collect recode should then be used to organize the desired levels of result into rows. However, the result levels _r_ci and _r_cri cannot be used with collect recode: the documentation states that _r_ci and _r_cri cannot be recoded to new levels, and it turns out that other levels of result cannot be recoded to _r_ci or _r_cri either. That is, attempting to recode the SEs as _r_ci using,

    Code:
    collect recode result _r_se=_r_ci
    produces the error "result _r_ci not allowed", a restriction not explicitly stated in the documentation of collect recode. As far as I can tell, there is no other way to tell Stata to place the CIs into my desired row, so it seems this is not possible at all.
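
    The restriction can be reproduced without aborting a do-file by wrapping the attempt in capture noisily. This is a minimal sketch, assuming the collection regs from #1 is the current collection; the return code is displayed rather than assumed:

    ```stata
    * Sketch: trap the recode error instead of stopping the do-file.
    * Assumes the collection regs from #1 is the current collection.
    capture noisily collect recode result _r_se=_r_ci
    display "collect recode return code: " _rc
    ```

    A nonzero return code here would confirm that the recode was rejected, while the rest of the do-file keeps running.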

    However, as stated above, the working example from #1 can be modified to produce a nice table with the desired structure, though with the CIs in their own row:

    Code:
    use "http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta", clear
    
    collect clear
    
    * Estimate and collect regression results
    // Model 1
    collect create reg1, replace
    collect, name(reg1): diff fte, t(treated) p(t) cov(bk kfc roys)
    collect addtags outcome[1], fortags(result)
    
    // Model 2
    collect create reg2, replace
    collect, name(reg2): diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)
    collect addtags outcome[2], fortags(result)
    
    * Combine regression output and create the table
    collect combine regs = reg1 reg2
    
    // Tags for treatment groups
    collect addtags treat[0], fortags(result[mean_c0 se_c0 mean_c1 se_c1])
    collect addtags treat[1], fortags(result[mean_t0 se_t0 mean_t1 se_t1])
    collect addtags treat[2], fortags(result[diff0 se_d0 diff1 se_d1])
    collect label levels treat 0 "Control" 1 "Treated" 2 "Diff (T-C)", modify
    
    // Tags for pre and post time periods
    collect addtags time[1], fortags(result[mean_c0 mean_t0 se_c0 se_t0 diff0 se_d0])
    collect addtags time[2], fortags(result[mean_c1 mean_t1 se_c1 se_t1 diff1 se_d1])
    collect label levels time 1 "Pre" 2 "Post"
    
    // Tags the diff-in-diff estimate
    collect recode colname _diff=did
    collect label levels colname did "Diff-in-Diff"
    
    // Organize the means into row one and SEs into row two
    collect recode result mean_c0=row1 mean_t0=row1 diff0=row1 mean_c1=row1 mean_t1=row1 diff1=row1 _r_b=row1
    collect recode result se_c0=row2 se_t0=row2 se_d0=row2 se_c1=row2 se_t1=row2 se_d1=row2 _r_se=row2
    collect label levels result row1 "Coeff." row2 "Std. Err."
    
    // Label each model
    collect label levels collection reg1 "Reg1" reg2 "Reg2"
    
    //  Format how each type of value is displayed
    collect style cell result[row1], nformat(%3.2f) halign(center)
    collect style cell result[row2], nformat(%3.2f) sformat("(%s)") halign(center)
    collect style cell result[_r_ci], cidelimiter(,) nformat(%3.2f) sformat("[%s]") halign(center)
    
    // Style the row and column headers to look nice
    collect style header result, level(hide)
    collect style column, dups(center) width(equal)
    collect style cell border_block, border(right, pattern(nil))
    collect style cell cell_type[column-header]#time[1] cell_type[column-header]#time[2], border(bottom, pattern(single))
    collect style cell cell_type[column-header]#time[3] cell_type[column-header]#treat, halign(center)
    collect style cell cell_type[row-header]#result[row2 _r_ci], halign(right) font(,italic)
    
    // Add stars to the diff-in-diff estimates
    collect stars _r_p 0.01 "****" 0.05 "***" 0.1 "**" 1 "*", attach(row1) clear
    
    // Define and view the table layout
    collect layout (collection#result[row1 row2 _r_ci]) (time#treat colname[did])
    produces the following table:

    HTML Code:
    Collection: regs
          Rows: collection#result[row1 row2 _r_ci]
       Columns: time#treat colname[did]
       Table 1: 6 x 7
    
    -----------------------------------------------------------------------------------------------
                           Pre                                   Post                  Diff-in-Diff
         -----------------------------------------------------------------------------            
            Control      Treated    Diff (T-C)     Control      Treated    Diff (T-C)              
    -----------------------------------------------------------------------------------------------
    Reg1     21.16        18.84        -2.32        18.76        19.37        0.61        2.94***  
            (1.14)       (0.85)       (1.03)       (1.16)       (0.85)       (1.04)       (1.46)  
                                                                                        [0.07,5.80]
    Reg2     20.04        17.07        -2.98        17.45        17.57        0.12        3.10***  
            (0.67)       (0.67)       (0.94)       (0.67)       (0.67)       (0.95)       (1.34)  
                                                                                        [0.48,5.72]
    -----------------------------------------------------------------------------------------------
    Hopefully this post is helpful to anyone trying to learn Stata 17's new suite of commands for customizing tables.
    Last edited by Andrew Padovani; 11 Jan 2022, 19:20. Reason: Some words and a typo.
