  • New command available on SSC -covbal-

    Hi fellow Stata-listers,

    Thanks, as always, to Kit Baum, a new Stata command called covbal is now available (the name stands for covariate balance; install it by typing: ssc install covbal).

    covbal produces a table of distributional test statistics (means, variances, and skewness) for each covariate specified, and assesses balance between treatment groups in the means (using the standardized difference) and in the variances (using the variance ratio).
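
    For readers unfamiliar with the two balance measures, here is a minimal by-hand sketch for a single covariate on unweighted data. It assumes the commonly used definitions (difference in group means divided by the square root of the average of the two group variances, and the ratio of the treated to the control variance); covbal's help file remains the reference for the exact formulas it applies. The scalar names are purely illustrative, and the cattaneo2 data are the example data used later in this thread.

    ```stata
    * By-hand balance measures for one covariate (mage) across mbsmoke groups.
    * Assumption: the usual textbook definitions, not necessarily covbal's internals.
    webuse cattaneo2, clear

    quietly summarize mage if mbsmoke == 1
    scalar m1 = r(mean)      // treated-group mean
    scalar v1 = r(Var)       // treated-group variance

    quietly summarize mage if mbsmoke == 0
    scalar m0 = r(mean)      // control-group mean
    scalar v0 = r(Var)       // control-group variance

    scalar stdiff = (m1 - m0) / sqrt((v1 + v0) / 2)
    scalar vratio = v1 / v0

    display "standardized difference: " %9.3f stdiff
    display "variance ratio:          " %9.3f vratio
    ```

    Values near 0 for the standardized difference and near 1 for the variance ratio indicate good balance on that covariate.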

    This command is very flexible and investigators who routinely use treatment effect estimators will find this command useful.

    I would like to give a big "thank you" to Nick Cox who greatly improved my code!

    As always, let me know if you come across any bugs, or perhaps additional measures that may be useful.

    Ariel


  • #2
    Thanks for this, looks very useful. Just one issue though: in the help file example it states:
    Code:
     *Saving the output in file named "output" in the default directory
     covbal mbsmoke mmarried mage fbaby medu, wt(iptw) abs for(%9.3f) save(output)
    which results in error "option save() not allowed", since the command asks for saving() and not save()


    • #3
      Ahhhh, excellent catch, Ariel (nice name, by the way)! I changed the option from "save" to "saving" but forgot to change the example code... You will see that "saving" does work as advertised. I'll make sure to correct the help file...
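
      Until the help file is corrected, the example quoted in #2 would presumably read as follows. This is only a sketch: it assumes the weight variable iptw has already been generated, and it changes nothing except the option name.

      ```stata
      * Saving the output in a file named "output" in the default directory
      * (same call as in #2, with save() replaced by saving())
      covbal mbsmoke mmarried mage fbaby medu, wt(iptw) abs for(%9.3f) saving(output)
      ```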

      Ariel


      • #4
        Hi Ariel,

        Is it possible to include the variable names in the saved .dta file? The variable names are included in the output, but they do not appear in the saved file.

        Thanks,

        Chris


        • #5
          While coding a workaround for #4 using Roger Newson's package qrowname, I found that the variable names can probably be saved into the .dta file. However, although it works when I retrieve the variable names individually, I run into a problem when I retrieve them in a loop.
          I am not sure whether this is due to poor coding on my part or to an error in the qrowname ado file.
          So, give it a try and, if possible, let me know where the flaw is coming from:
          Code:
          ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
          * Install package, if needed
          ssc install qrowname , replace    // Extract lists of quoted row and column names from a matrix
          ssc install covbal , replace     // Covariate balance statistics
          
          * Data set up
          webuse cattaneo2, clear
          * Run example of covbal
          * Estimate propensity score for mbsmoke as the treatment, and generate inverse probability of treatment weights (IPTW)
          logit mbsmoke mmarried c.mage c.mage#c.mage fbaby i.medu
          predict pscore, pr    // probability of a positive outcome; the default
          gen iptw = cond(mbsmoke, 1/pscore, 1/(1-pscore))
          
          * Run covbal on weighted data, but reporting the standardized differences as absolute values, and changing the format to %9.3f
          covbal mbsmoke mmarried mage fbaby medu, wt(iptw) abs for(%9.3f)
          
          matrix list r(table)
          matrix mrn = r(table)
          mat li mrn
          
          qrowname mrn    // R Newson
          dis `r(fullname)'
          dis `r(name)'
          local varlist `r(name)'
          * Note that the individual retrieval of matrix row names is correct:
          local var : word 1 of `varlist'
          di `"`var'"'
          local var : word 2 of `varlist'
          di `"`var'"'
          local var : word 3 of `varlist'
          di `"`var'"'
          local var : word 4 of `varlist'
          di `"`var'"'
          
          * Note that the looped retrieval of matrix row names is not correct
          * The second row name (word) is ` instead of mage, i.e. words are shifted one position, dropping the fourth 
          qrowname mrn    // R Newson
          local varlist `r(name)'
          local n_vars `r(nrow)'
          gen str8 vars=""        // text variable to be filled with the matrix names
          forval i=1/`n_vars' {
              dis `i'
              local this_var : word `i' of "`varlist'"
              dis `"`this_var'"'
              replace vars=`"`this_var'"' in `i'
          }
          ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
          http://publicationslist.org/eric.melse


          • #6
            Hi, for anyone interested in a workaround for #4, here it is.

            Roger Newson kindly pointed out my error in #5: "you forgot that, if you assign a list of double-quoted names to a macro, then you should enclose the whole list in double quotes. (Single quotes will not do.) And a similar rule applies if you display such a list of double-quoted words using display." With that corrected, this bit of syntax will provide the labels of the covariates:

            Code:
            * Set up Stata
            ssc install matnames , replace    // Austin Nichols
            ssc install qrowname , replace    // Roger B Newson
            
            * Set up data
            webuse cattaneo2, clear
            
            * Run covbal on unweighted data
            covbal mbsmoke mmarried mage fbaby medu, saving(output, replace)
            
            * Capture variable names in a matrix
            matnames r(table)
            mat mrn=J(4,1,0)
            mat rownames mrn=`r(r)'
            mat li mrn
            
            * Store variable names in a macro
            qrowname mrn , noisily
            global vars `"`r(name)'"'
            global n_vars `r(nrow)'
            
            * Open the saved file with summary statistics for the standardized differences and variance ratios
            use output, clear
            
            * Create a text variable that holds the labels of the covariate variables
            gen covar=""
            forval i=1/$n_vars {
                local this_var : word `i' of $vars
                dis `"`this_var'"'
                replace covar=`"`this_var'"' in `i'
            }
            order covar , b(tr_mean)
            save output, replace
            macro drop vars n_vars
            I assume a more elegant solution is possible, e.g. one that avoids the use of globals. Preferably, this would be taken care of altogether within the covbal package.
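
            One possible way to avoid both the globals and the extra packages is sketched below. It reads the row names of r(table) with Stata's built-in rownames macro extended function and keeps everything in locals. The assumptions are mine: that the saved file holds one row per covariate in the same order as r(table), that it does not already contain a variable called covar (the matrix name T and the variable name covar are just illustrative), and that the whole block is run in one go from a do-file so the locals survive until the loop.

            ```stata
            * Set up data and run covbal on unweighted data
            webuse cattaneo2, clear
            covbal mbsmoke mmarried mage fbaby medu, saving(output, replace)

            * Copy the results matrix and read its row names into a local
            matrix T = r(table)
            local names : rownames T
            local k = rowsof(T)

            * Open the saved file and fill a string variable with the names
            use output, clear
            gen str32 covar = ""
            forvalues i = 1/`k' {
                local this_var : word `i' of `names'
                replace covar = "`this_var'" in `i'
            }
            order covar, first
            save output, replace
            ```

            Because the row names come back as a plain space-separated list, no compound double quotes are needed here.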

            Furthermore, note that Trenton Mize wrote the package balanceplot (see his webpage), which can be installed with this command:
            Code:
            net install balanceplot, from("https://tdmize.github.io/data/balanceplot")
            Here is the syntax to run balanceplot using our example:
            Code:
            * Set up data
            webuse cattaneo2, clear
            
            * Create covariate balance plot on unweighted data
            balanceplot mmarried mage fbaby medu, group(mbsmoke) nodropdv nosort graphop(title(Imbalance in Covariates across Groups,size(*.8)))
            That results in:
            [Image: balanceplotExample.png]
            which matches the result of covbal.
            Of course, the results of covbal can also be used to create your own graph any way you see fit.
            http://publicationslist.org/eric.melse


            • #7
              Hi, for anyone interested in a solution for #4: note that covbal has been updated and now includes the variable names in the result file.
              So, you have to run:
              Code:
              ssc install covbal , replace
              to get it to work for you as well.
              http://publicationslist.org/eric.melse


              • #8
                Hi Ariel, so glad to have found this command. Thank you for making it. Does it distinguish between continuous and dichotomous variables?
