Hi everyone. I need your help with the following.
I'm running a difference-in-differences analysis, and the first stage is to match each observation in the treated group with one observation in the control group by nearest-neighbour propensity score matching.
For simplicity, I use the following sample data:
use http://ssc.wisc.edu/sscc/pubs/files/psm, replace
(it contains a treatment indicator t, covariates x1 and x2, and an outcome y)
Then I use psmatch2 for the propensity score matching:
psmatch2 t x1 x2, out(y) logit
Now I have a new id (generated by Stata as _id) for each treated observation and the id of its matched control observation, so every pair is identified. After dropping the control observations that are not matched with any treated observation, I have a new, matched sample.
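For reference, a minimal sketch of that step. It assumes psmatch2 (from SSC) is run with its defaults, i.e. a single nearest neighbour with replacement, and that it leaves behind its usual variables _weight, _treated, _id and _n1. The name pair_id is my own, purely for illustration.

```stata
* Sketch only: assumes psmatch2 defaults (one nearest neighbour, with replacement)
use http://ssc.wisc.edu/sscc/pubs/files/psm, replace
psmatch2 t x1 x2, outcome(y) logit

* _weight is missing for observations that were not used in any match,
* so this keeps the treated observations and their matched controls only
drop if missing(_weight)

* pair_id (hypothetical name) labels each pair by the _id of its control;
* _n1 holds the _id of the nearest-neighbour control for each treated observation
generate pair_id = _id if _treated == 0
replace pair_id = _n1 if _treated == 1
```

Because matching is with replacement here, one control can be the nearest neighbour of several treated observations, in which case they all share the same pair_id.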
Next, I want to run a regression to test the effect of the treatment, and I want the variable t (1 for treated, 0 for control) to capture the difference between the treated and the control observation within each pair (as matched in the propensity score step). I got confused at this stage, because if I simply run:
reg y t x1 x2
then t captures the average difference between the whole treated group and the whole control group, rather than the difference within each pair.
Can you please suggest how I can solve this?
Thank you so much.