How does teffects ipwra work on a binary outcome?

Anthony Chan

Join Date: Jul 2020

Posts: 2
#1

26 Jul 2020, 21:50

I am trying to understand exactly how the Stata command teffects ipwra works on a binary outcome Y
by manually replicating every step described in the Stata Treatment Effects Reference Manual, pp. 247-256.
All the numerical examples in the manual use a continuous outcome, Y=bweight.

For a continuous Y, I can replicate all the results (POMs, ATE, ATET) produced by teffects ipwra
with the following steps:
(1) Use glm with the binomial family and logit link to fit a logistic regression of the 0/1 treatment
variable Z on the covariate vector X=(1,x1,...,xp), and save the fitted conditional means of Z as the
propensity score ps = P(Z=1|x).

(2) Run two weighted linear regressions separately, one of the observed treatment outcomes Y1
on X and another of the observed control outcomes Y0 on X, where the weights w are inverse
propensity scores: e.g., for ATE estimation, the weights are
w1 = 1/ps for treatment outcome observations Y1 and w0 = 1/(1-ps) for control outcome observations Y0.
I do this step by running glm with the Gaussian family and identity link, with analytic weights w1 and w0 as defined.

(2a) Because this is a weighted Gaussian regression, it is equivalent to estimating an unweighted
linear regression of sqrt(w)Y on sqrt(w)X (which includes the scaled constant column) without a separate intercept.

(3) Save the linear regression fitted values Y1Hat and Y0Hat from the two regressions respectively,
both computed for all X observations, treatment and control.
Then the sample mean of Y1Hat is POM1, the sample mean of Y0Hat is POM0, and their difference
is the ATE, as produced by the teffects ipwra command directly.
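
Steps (1) to (3) can be sketched in Stata roughly as follows. This is only an illustration under assumptions: the cattaneo2 dataset and the covariate lists are chosen for the example and are not taken from the post, and pweights are used instead of aweights (both give the same coefficients in these regressions). The point estimates should agree with teffects ipwra as long as the estimation samples match; the standard errors from the manual regressions will not, because they ignore the fact that the propensity score is estimated.

```stata
* Sketch only: dataset and covariate lists are illustrative assumptions
webuse cattaneo2, clear

* Step (1): propensity score from a logit treatment model
logit mbsmoke mmarried mage fbaby medu
predict double ps, pr

* ATE weights: 1/ps for treated, 1/(1-ps) for controls
generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))

* Step (2): weighted linear regressions, fitted separately by treatment arm
regress bweight mmarried mage prenatal1 fbaby [pweight = w] if mbsmoke == 1
predict double y1hat, xb        // fitted values for ALL observations
regress bweight mmarried mage prenatal1 fbaby [pweight = w] if mbsmoke == 0
predict double y0hat, xb        // fitted values for ALL observations

* Step (3): average the fitted values over the full sample
summarize y1hat, meanonly
scalar pom1 = r(mean)
summarize y0hat, meanonly
scalar pom0 = r(mean)
scalar ate = pom1 - pom0
display "POM0 = " pom0
display "POM1 = " pom1
display "ATE  = " ate

* Built-in estimator for comparison
teffects ipwra (bweight mmarried mage prenatal1 fbaby) ///
    (mbsmoke mmarried mage fbaby medu, logit), ate aequations
```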

For a binary outcome Y, I do step (2) with glm, binomial family and logit link, with analytic weights
w1 and w0 as defined. The corresponding results from step (3) work out correctly, i.e., my manually
calculated POMs and ATE are the same as those produced by teffects ipwra directly.
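
A minimal sketch of the binary-outcome version, again under assumptions: it uses the cattaneo2 dataset with lbweight as an illustrative binary outcome, the covariate lists are made up for the example, and the weights enter as pweights. The only changes from the continuous case are that the outcome models are weighted logits and the POMs are averages of predicted probabilities.

```stata
* Sketch only: dataset, outcome, and covariate lists are illustrative assumptions
webuse cattaneo2, clear

* Step (1): propensity score and ATE weights
logit mbsmoke mmarried mage fbaby medu
predict double ps, pr
generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))

* Step (2): weighted logit outcome models, one per treatment arm
logit lbweight mmarried mage prenatal1 fbaby [pweight = w] if mbsmoke == 1
predict double p1hat, pr        // predicted probabilities for ALL observations
logit lbweight mmarried mage prenatal1 fbaby [pweight = w] if mbsmoke == 0
predict double p0hat, pr        // predicted probabilities for ALL observations

* Step (3): the POMs are the full-sample means of the predicted probabilities
summarize p1hat p0hat

* Built-in estimator for comparison (logit outcome model, logit treatment model)
teffects ipwra (lbweight mmarried mage prenatal1 fbaby, logit) ///
    (mbsmoke mmarried mage fbaby medu, logit), pomeans aequations
```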

My question now is: how do I carry out step (2a)? That is, how do I manually weight the binary outcome Y
and X by the inverse propensity scores before running glm binomial logit without specifying
any analytic weights? Wouldn't the weighted Y values no longer be binary?

Could someone help explain?
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

26 Jul 2020, 22:07

You shouldn't be using -aweights- in any case. The weights calculated in step 1 should be used as pweights: they are inverse probabilities of sampling. When you use them as aweights in a linear regression, you get the same coefficients anyhow, but the standard errors and other statistics derived from them are incorrect. Now, since you aren't using the latter statistics in your later steps it doesn't matter and you don't notice the error.

But then you hit a wall with a dichotomous outcome because logit does not support aweights--in fact the whole idea of aweights makes no sense with dichotomous outcomes. But -logit- does support pweights, which are the correct ones to use anyway.

Last edited by Clyde Schechter; 26 Jul 2020, 22:10.
Anthony Chan

Join Date: Jul 2020

Posts: 2
#3

27 Jul 2020, 00:01

Hi Clyde, thanks so much for your response.
With a dichotomous outcome Y, I first run teffects ipwra for the POMs with the auxiliary equations displayed.
Then I try to replicate the auxiliary-equation results by running glm on Y with the
binomial family and logit link, weighted by w1 and w0 obtained from step (1), specified as aweights
in one run and as pweights (i.e., sampling weights) in another. Both attempts produce
the same, correct coefficients for the auxiliary equations, i.e., the same as those produced by
the teffects command directly. (You are right, the standard errors are all different.)
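
The comparison described above could be set up along these lines. This is a sketch with assumed names: the cattaneo2 dataset, lbweight as the binary outcome, and the covariate lists are illustrative, and only the treated-arm model is shown.

```stata
* Sketch only: dataset, outcome, and covariate lists are illustrative assumptions
webuse cattaneo2, clear

* Inverse-propensity weights for the treated arm
logit mbsmoke mmarried mage fbaby medu
predict double ps, pr
generate double w1 = 1/ps if mbsmoke == 1

* Same weighted binomial-logit model, once with aweights and once with pweights
glm lbweight mmarried mage prenatal1 fbaby [aweight = w1] if mbsmoke == 1, ///
    family(binomial) link(logit)
estimates store aw
glm lbweight mmarried mage prenatal1 fbaby [pweight = w1] if mbsmoke == 1, ///
    family(binomial) link(logit)
estimates store pw

* Side-by-side coefficients and standard errors
estimates table aw pw, b(%9.5f) se(%9.5f)
```

The coefficient columns are expected to coincide while the standard-error columns differ, since pweights imply robust standard errors.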

But my question is this: since the auxiliary-equation coefficients are coefficient estimates
of a weighted logistic regression of Y on the covariates X (whether using aweights or pweights
in the glm command), I should be able to obtain the same estimates by first manually
weighting the observations Y and X (in Gaussian linear regression, we would multiply Y and X by
the square root of w) and then running glm binomial logit on the weighted Y and X without
specifying any aweight or pweight. In other words, how do I apply the
weights w1 and w0 to the observations Y1, Y0, and X before running glm?
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#4

27 Jul 2020, 12:51

I don't think that a -pweight-ed logistic regression analysis can be emulated by multiplying the Y and X variables by anything. As I understand it, the application of the pweights in this case is reflected in how the likelihood is calculated and it does not correspond to any transformation of the variables in the model. I'm not 100% certain of this, but fairly confident of this answer.
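
A short sketch of the reasoning, assuming the usual weighted (pseudo-)likelihood form; the notation (beta, p_i, ell_w) is introduced here for illustration and is not from the manual. In the Gaussian case the weight can be absorbed into the data, but in the logit case it multiplies each observation's log-likelihood contribution:

```latex
% Weighted least squares: the weight can be moved inside the square,
% which is why regressing sqrt(w)*Y on sqrt(w)*X works.
\sum_i w_i \,(y_i - x_i'\beta)^2
  \;=\; \sum_i \bigl(\sqrt{w_i}\,y_i - \sqrt{w_i}\,x_i'\beta\bigr)^2

% Weighted logit: the weight multiplies the log-likelihood contribution.
\ell_w(\beta) \;=\; \sum_i w_i \,\bigl[\, y_i \log p_i + (1 - y_i)\log(1 - p_i) \,\bigr],
\qquad p_i = \frac{1}{1 + \exp(-x_i'\beta)}

% Setting the derivative to zero gives the weighted score equations.
\sum_i w_i \,(y_i - p_i)\, x_i \;=\; 0
```

Because p_i is a nonlinear function of x_i'beta, rescaling x_i changes p_i itself, and rescaling y_i takes it outside {0, 1}, so the weight cannot be absorbed into the data the way sqrt(w) is in the linear case. The weight instead acts as if observation i were counted w_i times in the likelihood.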
Mitch Lingo

Join Date: Jul 2018

Posts: 30
#5

01 Dec 2022, 08:07

Anthony, I am wondering if you ever came up with a working model to get logit outcomes to work. If so, could I have the syntax?

TIA