  • How does teffects ipwra work on a binary outcome?

    I am trying to understand exactly how the Stata command teffects ipwra works on a binary outcome Y
    by manually replicating every step described in the Stata Treatment Effects Reference Manual, pp. 247-256.
    All the numerical examples in the manual use a continuous outcome, Y=bweight.

    For a continuous Y, I can replicate all the results (POMs, ATE, ATET) produced by teffects ipwra,
    by doing the following steps:
    (1) use glm binomial logit to estimate a logistic regression model of the treatment 0/1 variable Z on
    the covariate vector X=(1,x1,...,xp), save the fitted conditional means of Z as the propensity score
    ps = P(Z=1|x), and then

    (2) run two weighted linear regression models separately, one for the observed treatment outcomes Y1
    on X, another for the observed control outcomes Y0 on X, where the weights w are inverse propensity
    scores: e.g., for ATE estimation, we define weights
    w1 = 1/ps for treatment outcome observations Y1, w0 = 1/(1-ps) for control outcome observations Y0.
    I do this step by running glm with the Gaussian family and identity link, with analytic weights w1 and w0 as defined.

    (2a) Because this is a weighted Gaussian regression, it is equivalent to estimating an unweighted
    linear regression of sqrt(w)Y on sqrt(w)X without a separate intercept.

    (3) save the linear regression fitted values Y1Hat and Y0Hat from the two regressions respectively,
    both for all X observations, treatment and control.
    Then the sample mean of Y1Hat is POM1 and the sample mean of Y0Hat is POM0, and their difference
    is the ATE as produced directly by the teffects ipwra command.
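
    Steps (1) to (3), together with the check in (2a), can be sketched in Stata roughly as follows. This is an illustration under assumptions: it uses the cattaneo2 data from the manual's examples, but the covariate lists are chosen here for brevity rather than copied from the manual, and the variable names ps, w, sw, y1hat, y0hat and the s_ prefix are made up for this sketch. It uses pweights in regress; the point estimates are the same as with analytic weights.

    ```stata
    webuse cattaneo2, clear

    * Reference result: POMs plus the auxiliary (outcome and treatment) equations
    teffects ipwra (bweight mmarried mage fbaby) ///
                   (mbsmoke mmarried mage fbaby medu), pomeans aequations

    * Step (1): logit model for treatment, save the propensity score
    logit mbsmoke mmarried mage fbaby medu
    predict double ps, pr

    * ATE weights: 1/ps for treated, 1/(1-ps) for controls
    generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))

    * Step (2): weighted outcome regressions, one per treatment arm
    * Step (3): predict for ALL observations, not only the estimation sample
    regress bweight mmarried mage fbaby if mbsmoke == 1 [pweight = w]
    predict double y1hat, xb
    regress bweight mmarried mage fbaby if mbsmoke == 0 [pweight = w]
    predict double y0hat, xb

    * Sample means of the fitted values are the POMs
    summarize y1hat y0hat

    * Step (2a) check for the treated arm: unweighted OLS on sqrt(w)-scaled data.
    * sw itself stands in for the scaled constant, hence noconstant.
    generate double sw = sqrt(w)
    foreach v of varlist bweight mmarried mage fbaby {
        generate double s_`v' = sw * `v'
    }
    regress s_bweight s_mmarried s_mage s_fbaby sw if mbsmoke == 1, noconstant
    ```

    The coefficients from the last regress should match those of the weighted regression for the treated arm, with the coefficient on sw playing the role of the intercept. The standard errors from these manual steps are not expected to match those reported by teffects ipwra.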

    For a binary outcome Y, I would have to do step (2) with glm binomial logit, with analytic weights
    w1 and w0 as defined. The corresponding results from step (3) work out correctly, i.e., my manually
    calculated POMs and ATE are the same as those produced by teffects ipwra directly.
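
    The binary-outcome version of the same replication might look like the sketch below. Again this is only an illustration: lbweight (the low-birthweight indicator in cattaneo2) is used as the binary Y, the covariate lists are assumptions made for this example, and ps, w, p1hat and p0hat are hypothetical names. It uses logit with pweights instead of glm with analytic weights.

    ```stata
    webuse cattaneo2, clear

    * Reference result with a logit outcome model
    teffects ipwra (lbweight mmarried mage fbaby, logit) ///
                   (mbsmoke mmarried mage fbaby medu), pomeans aequations

    * Step (1): propensity score from a logit treatment model
    logit mbsmoke mmarried mage fbaby medu
    predict double ps, pr
    generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))

    * Step (2): weighted logit outcome models, one per treatment arm
    * Step (3): predicted probabilities for ALL observations
    logit lbweight mmarried mage fbaby if mbsmoke == 1 [pweight = w]
    predict double p1hat, pr
    logit lbweight mmarried mage fbaby if mbsmoke == 0 [pweight = w]
    predict double p0hat, pr

    * Sample means of the predicted probabilities are the POMs
    summarize p1hat p0hat
    ```

    The two logit coefficient vectors can be compared with the outcome equations shown by the aequations option, and the two means with the reported POMs.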

    My question now is: How do I carry out step (2a)? That is, how do I manually weight the binary outcome Y
    and X using the inverse propensity scores before running glm binomial logit without specifying
    any analytic weights? Wouldn't the weighted Y values no longer be binary?

    Could someone help explain?


  • #2
    You shouldn't be using -aweights- in any case. The weights calculated in step 1 should be used as pweights: they are inverse probabilities of sampling. When you use them as aweights in a linear regression, you get the same coefficients anyhow, but the standard errors and other statistics derived from them are incorrect. Now, since you aren't using the latter statistics in your later steps it doesn't matter and you don't notice the error.

    But then you hit a wall with a dichotomous outcome because logit does not support aweights--in fact the whole idea of aweights makes no sense with dichotomous outcomes. But -logit- does support pweights, which are the correct ones to use anyway.
    Last edited by Clyde Schechter; 26 Jul 2020, 22:10.
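
    The aweight versus pweight point for the linear case can be checked with a sketch like this one. The data set, covariates, and the names ps, w, aw and pw are assumptions chosen for illustration, not something specified in this thread.

    ```stata
    webuse cattaneo2, clear

    * Inverse-probability weights from a logit treatment model
    logit mbsmoke mmarried mage fbaby medu
    predict double ps, pr
    generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))

    * Same weighted regression, once with aweights and once with pweights
    regress bweight mmarried mage fbaby if mbsmoke == 1 [aweight = w]
    estimates store aw
    regress bweight mmarried mage fbaby if mbsmoke == 1 [pweight = w]
    estimates store pw

    * Side-by-side coefficients and standard errors
    estimates table aw pw, se
    ```

    The coefficient rows should agree across the two columns while the standard errors differ. Neither set of standard errors accounts for the fact that the weights were themselves estimated in the first step.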



    • #3
      Hi, Clyde, thanks so much for your response.
      With a dichotomous outcome Y, I first ran teffects ipwra for the POMs with the auxiliary equations displayed.
      Then I tried to replicate the auxiliary-equation results by running a glm command on Y with
      the binomial family and logit link, weighted by w1 and w0 obtained from step (1), specified as aweights
      for one run and then as pweights (i.e., sampling weights) for another run. Both attempts produce
      the same, correct coefficients for the auxiliary equations, i.e., the same as those produced by the
      teffects command directly. (You are right, the standard errors are all different.)

      But my question is, since the auxiliary equations coefficients are actually coefficient estimates
      of a weighted logistic regression of Y on the covariates X, (whether using aweight or pweight
      in a glm command), I should be able to obtain the same estimates by first manually
      weighting the observations Y and X (in Gaussian linear regression, we would multiply Y and X by
      the square-root of w) before running glm binomial logit on the weighted Y and X without further
      specifying any weighting scheme of aweight or pweight. In other words, how do I apply the
      weights w1 and w0 on the observations Y1, Y0, and X before running glm?



      • #4
        I don't think that a -pweight-ed logistic regression analysis can be emulated by multiplying the Y and X variables by anything. As I understand it, the application of the pweights in this case is reflected in how the likelihood is calculated, and it does not correspond to any transformation of the variables in the model. I'm not 100% certain of this, but I am fairly confident in this answer.
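
        The contrast between the two cases can be written out as follows. This is a sketch using the standard weighted least-squares objective and weighted logit log pseudo-likelihood; the symbols \beta, x_i, w_i and \Lambda are notation introduced here, and nothing below is a statement about how Stata implements the estimator internally.

        ```latex
        \begin{align*}
        % Linear case: the weight factors into the squared residual
        \hat\beta_{\mathrm{lin}}
          &= \arg\min_{\beta} \sum_i w_i \,(y_i - x_i'\beta)^2
           = \arg\min_{\beta} \sum_i \bigl(\sqrt{w_i}\,y_i - \sqrt{w_i}\,x_i'\beta\bigr)^2 \\[4pt]
        % Logit case: the weight multiplies each log-likelihood contribution
        \ell_w(\beta)
          &= \sum_i w_i \Bigl[\, y_i\, x_i'\beta - \log\bigl(1 + e^{x_i'\beta}\bigr) \Bigr] \\[4pt]
        % Weighted score equations for the logit
        0 &= \sum_i w_i \,\bigl(y_i - \Lambda(x_i'\beta)\bigr)\, x_i ,
          \qquad \Lambda(t) = \frac{1}{1 + e^{-t}}
        \end{align*}
        ```

        In the linear case the residual is linear in y_i and x_i, so scaling both by sqrt(w_i) reproduces the weighted objective exactly. In the logit case x_i also enters through the nonlinear function \Lambda, so rescaling x_i changes the fitted probability as well as the weight, and a rescaled y_i is no longer 0/1. The weights therefore have to enter through the likelihood, which is what the pweight syntax does.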



        • #5
          Anthony, I am wondering if you ever came up with a working model to get logit outcomes to work. If so, could I have the syntax?

          TIA
