  • Obtaining weighted means and SDs using -tebalance summarize- after -teffects ipw-

    Dear all,

    I am doing some observational research that uses IPTW via the -teffects ipw- command.

    After running the analysis, I use the -tebalance summarize- command to obtain the standardized differences, like so:
    Code:
    quietly webuse cattaneo2, clear
    
    quietly teffects ipw (bweight) (mbsmoke foreign alcohol mage medu fage fedu, logit)
    
     
    tebalance summarize mage alcohol
    
      Covariate balance summary
                                                       Raw     Weighted
                              -----------------------------------------
                              Number of obs =        4,642      4,642.0
                              Treated obs   =          864      2,238.7
                              Control obs   =        3,778      2,403.3
                              -----------------------------------------
    
      -----------------------------------------------------------------
                      |Standardized differences          Variance ratio
                      |        Raw    Weighted           Raw   Weighted
      ----------------+------------------------------------------------
                 mage |   -.300179   -.0893001      .8818025   .8381985
              alcohol |   .3222725   -.0033769      4.509207   .9828912
      -----------------------------------------------------------------
    but does anyone know how to get the actual weighted means and SDs (or weighted numbers/percentages for binary variables)?

    I have attached a figure showing what I am trying to achieve.

    [Attached figure: image.jpg]


    The columns to the right show the weighted numbers and percentages (for binary variables) and weighted means/SDs (for continuous variables), and these are the numbers I would like to get.


    My attempt has been to use each demographic variable that I am interested in as an "outcome" in the -teffects ipw- analysis and use the POMs as the weighted means.

    Code:
    . teffects ipw (mage) (mbsmoke foreign alcohol mage medu fage fedu, logit), pom
    
    Iteration 0:   EE criterion =  3.138e-17  
    Iteration 1:   EE criterion =  1.339e-29  
    
    Treatment-effects estimation                    Number of obs     =      4,642
    Estimator      : inverse-probability weights
    Outcome model  : weighted mean
    Treatment model: logit
    ------------------------------------------------------------------------------
                 |               Robust
            mage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    POmeans      |
         mbsmoke |
      nonsmoker  |   26.45739   .0842272   314.12   0.000     26.29231    26.62247
         smoker  |   25.96681   .1190025   218.20   0.000     25.73357    26.20005
    ------------------------------------------------------------------------------
    I think this is correct for the weighted means, but I am unsure how to get from the robust SEs shown to SDs. Since SE = SD / sqrt(n), perhaps it is just a matter of multiplying the SE by the square root of the number of observations in the weighted sample?

    For example:

    Code:
    . di sqrt(2403.3) * .0842272
    4.1291091
    for the control group above, and

    Code:
    . di sqrt(2238.7) * .1190025
    5.6305917
    for the treatment group

    Can anyone help?

    Many thanks, Phil

  • #2
    Alternatively, I thought I could handle the continuous variables manually, like so:
    Code:
    . * manually calculate inverse-probability weights:
    quietly logit mbsmoke foreign alcohol mage medu fage fedu
    
    quietly predict double ps if e(sample)
    
    quietly gen double ipw = 1.mbsmoke/ps + 0.mbsmoke/(1-ps)
    
    
    
     * use -mean- to get weighted means
     
    mean mage [pw=ipw] if e(sample), over(mbsmoke)
    
    Mean estimation                   Number of obs   =      4,642
    
        nonsmoker: mbsmoke = nonsmoker
           smoker: mbsmoke = smoker
    
    --------------------------------------------------------------
            Over |       Mean   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
    mage         |
       nonsmoker |   26.45739   .0979118      26.26544    26.64934
          smoker |   25.96681   .1983074      25.57803    26.35558
    --------------------------------------------------------------
    
    . estat sd
    
        nonsmoker: mbsmoke = nonsmoker
           smoker: mbsmoke = smoker
    
    -------------------------------------
            Over |       Mean   Std. Dev.
    -------------+-----------------------
    mage         |
       nonsmoker |   26.45739    7.183891
          smoker |   25.96681    3.258869
    -------------------------------------



    • #3
      Joerg Luedicke (StataCorp), do you have any advice on this issue? Thank you.



      • #4
        Hi Philip,

        You would simply use the weighted sample standard deviations, where the weights are the normalized inverse-probability weights. It would also be possible to calculate the standard deviations from the standard errors, but those would then need to be non-robust standard errors. However, there is no need to go this route, as you can simply use summarize with iweights here:

        Code:
        . * Data:
        . webuse cattaneo2, clear
        (Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)
        . 
        . * Model:
        . qui teffects ipw (bweight) (mbsmoke foreign alcohol mage medu fage fedu)
        . 
        . * IPWs:
        . predict double ps, ps tlevel(1)
        . gen double ipw = 1.mbsmoke/ps + 0.mbsmoke/(1-ps)
        . 
        . * Normalizing weights:
        . sum ipw, mean
        . qui replace ipw = ipw/r(mean)
        . 
        . * Inverse probability weighted means and SDs:
        . bysort mbsmoke: summarize mage alcohol [iw=ipw]
        
        -------------------------------------------------------------------------------
        -> mbsmoke = nonsmoker
        
            Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
        -------------+-----------------------------------------------------------------
                mage |   3,778  2403.34515    26.45739    5.73034         13         45
             alcohol |   3,778  2403.34515    .0346019   .1828073          0          1
        
        -------------------------------------------------------------------------------
        -> mbsmoke = smoker
        
            Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
        -------------+-----------------------------------------------------------------
                mage |     864  2238.65485    25.96681   5.246309         14         43
             alcohol |     864  2238.65485    .0339872   .1812367          0          1

        This corresponds to the calculations of the standardized mean differences from tebalance summarize:
        Code:
        . webuse cattaneo2, clear
        (Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)
        
        . qui teffects ipw (bweight) (mbsmoke foreign alcohol mage medu fage fedu)
        . predict double ps, ps tlevel(1)
        . gen double ipw = 1.mbsmoke/ps + 0.mbsmoke/(1-ps)
        . sum ipw, mean
        . qui replace ipw = ipw/r(mean)
        . sum mage if mbsmoke == 0 [iw=ipw]
        
            Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
        -------------+-----------------------------------------------------------------
                mage |   3,778  2403.34515    26.45739    5.73034         13         45
        
        . scalar Vc = r(Var)
        . scalar Mc = r(mean)
        . sum mage if mbsmoke == 1 [iw=ipw]
        
            Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
        -------------+-----------------------------------------------------------------
                mage |     864  2238.65485    25.96681   5.246309         14         43
        
        . scalar Vt = r(Var)
        . scalar Mt = r(mean)
        . scalar stddiff_ipw = (scalar(Mt)-scalar(Mc))/sqrt((scalar(Vt)+scalar(Vc))/2)
        . display %8.7f scalar(stddiff_ipw)
        -0.0893001
        
        . tebalance summarize mage
        
          Covariate balance summary
                                                           Raw     Weighted
                                  -----------------------------------------
                                  Number of obs =        4,642      4,642.0
                                  Treated obs   =          864      2,238.7
                                  Control obs   =        3,778      2,403.3
                                  -----------------------------------------
        
          -----------------------------------------------------------------
                          |Standardized differences          Variance ratio
                          |        Raw    Weighted           Raw   Weighted
          ----------------+------------------------------------------------
                     mage |   -.300179   -.0893001      .8818025   .8381985
          -----------------------------------------------------------------
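
        For reference, the calculation in the code above can be written out as follows. This is only a restatement of those commands; the symbols ($T_i$ for the treatment indicator, $\hat p_i$ for the predicted propensity score, $M$ and $V$ for the weighted means and variances returned by summarize) are notation introduced here, not taken from the Stata documentation.

        ```latex
        % ATE-type inverse-probability weights, then normalized to mean 1
        w_i = \frac{T_i}{\hat p_i} + \frac{1 - T_i}{1 - \hat p_i},
        \qquad
        \tilde w_i = \frac{w_i}{\bar w}

        % Weighted standardized difference for a covariate (here mage)
        d = \frac{M_t - M_c}{\sqrt{(V_t + V_c)/2}}
          = \frac{25.96681 - 26.45739}{\sqrt{(5.246309^2 + 5.73034^2)/2}}
          \approx -0.0893
        ```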
        I hope this helps!

        Joerg



        • #5
          Joerg,

          What a superb answer! Thank you so much for taking the time. I really hope this will benefit someone else too.



          • #6
            Thanks Joerg,

            I was wondering how to generate baseline characteristics for categorical variables, displayed as n (%), after propensity-score weighting.
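
            A minimal sketch of one possible approach, reusing the normalized weights from post #4. It assumes that -tabulate- accepts iweights for two-way tables and that r(sum) after -summarize- with iweights holds the weighted total; alcohol merely stands in for any binary or categorical covariate.

            ```stata
            * Sketch only: weighted n (%) by treatment arm, using the weights from post #4
            webuse cattaneo2, clear
            quietly teffects ipw (bweight) (mbsmoke foreign alcohol mage medu fage fedu)

            * Normalized inverse-probability weights, as in post #4
            predict double ps, ps tlevel(1)
            generate double ipw = 1.mbsmoke/ps + 0.mbsmoke/(1-ps)
            summarize ipw, meanonly
            quietly replace ipw = ipw/r(mean)

            * Weighted cell counts with column percentages (one column per treatment arm)
            tabulate alcohol mbsmoke [iw=ipw], column

            * For a 0/1 variable: r(sum) is the weighted count, r(mean) the weighted proportion
            summarize alcohol if mbsmoke == 1 [iw=ipw], meanonly
            display "smokers, alcohol: weighted n = " %6.1f r(sum) " (" %4.1f 100*r(mean) "%)"
            ```

            The weighted counts are generally not integers, so they would need rounding before going into a table.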



            • #7
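              Code:
              . * $Y = outcome, $A = binary treatment, $W = covariates (globals defined beforehand)
              . teffects ipw ($Y) ($A $W), atet
              . predict double ps, ps tlevel(1)
              . * ATET weights: 1 for treated, ps/(1-ps) for controls
              . gen double ipw = 1.$A + (0.$A/(1-ps))*ps
              . * Normalize the weights to mean 1
              . sum ipw, mean
              . replace ipw = ipw/r(mean)

              . * Weighted mean and variance among controls
              . sum $Y if $A == 0 [iw=ipw]
              . scalar Vc = r(Var)
              . scalar Mc = r(mean)

              . * Weighted mean and variance among treated
              . sum $Y if $A == 1 [iw=ipw]
              . scalar Vt = r(Var)
              . scalar Mt = r(mean)
              . * Weighted standardized difference, for comparison with tebalance summarize
              . scalar stddiff_ipw = (scalar(Mt)-scalar(Mc))/sqrt((scalar(Vt)+scalar(Vc))/2)
              . display %8.7f scalar(stddiff_ipw)
              . tebalance summarize $Y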
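
              Written out, the weights generated in the code above are as follows. The notation ($T_i$ for the treatment indicator, $\hat p_i$ for the predicted propensity score) is introduced here for illustration only.

              ```latex
              % ATET weights: treated observations keep weight 1,
              % controls are weighted by the odds of treatment
              w_i = T_i + (1 - T_i)\,\frac{\hat p_i}{1 - \hat p_i},
              \qquad
              \tilde w_i = \frac{w_i}{\bar w}
              ```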



              • #8
                Dear Joerg Luedicke, would you be able to illustrate this as in post #4 above, but for multivalued treatments (3 categories)? Many thanks in advance.
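
                A minimal, untested sketch of how post #4 might extend to a multivalued treatment. It assumes msmoke in cattaneo2 as the treatment (I believe it has four levels rather than three, but the loop does not depend on the number), and it assumes the appropriate weight is the inverse of the predicted probability of the level actually received. This is not a confirmed StataCorp recipe.

                ```stata
                * Sketch only: weighted means and SDs after -teffects ipw- with a multivalued treatment
                webuse cattaneo2, clear
                quietly teffects ipw (bweight) (msmoke foreign alcohol mage medu fage fedu)

                * One propensity score per treatment level; weight = 1 / Pr(level actually received)
                generate double ipw = .
                quietly levelsof msmoke, local(levels)
                foreach k of local levels {
                    predict double ps`k', ps tlevel(`k')
                    quietly replace ipw = 1/ps`k' if msmoke == `k'
                }

                * Normalize the weights to mean 1, as in post #4
                summarize ipw, meanonly
                quietly replace ipw = ipw/r(mean)

                * Weighted means and SDs within each treatment level
                bysort msmoke: summarize mage alcohol [iw=ipw]
                ```

                Standardized differences between any two levels could then be computed from r(mean) and r(Var) exactly as in post #4, one pair of levels at a time.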
