
  • #16
    Thank you very much, Clyde!

    When I ran:

    foreach var in GDP ROL Politicalstab Globalizationindex {
    ttest `var', by(SR) [pw=weight] if Year < 2003
    }

    I got an error.

    Instead, I used this:

    foreach var in GDP ROL Politicalstab Globalizationindex {
        reg `var' i.SR [aweight=weight] if Year < 2003
        * Use lincom to simulate a t-test
        lincom _b[1.SR]
    }

    I ran the regression and used the lincom approach to check the balance between treatment and control groups for my covariates. Is there any other way of doing it, or does this method suffice?



    • #17
      Neither of these approaches is appropriate. Balance is a property of the sample, not of the population from which the sample was drawn. Consequently, anything you do that culminates in a p-value and a significance verdict is irrelevant and wrong here. Please refer to my earlier post in this thread on using -mean- and -proportion- for these purposes.

      You just want to look at the actual mean values of the variables, -pweight-ed (not -aweight-ed) by the propensity weight, in the samples, and make a judgment as to whether there is enough remaining imbalance in the variables to make omitted variable bias a worry. (What counts as enough remaining imbalance depends both on the actual differences between the means and on how strongly the variables are associated with the outcome variable of your subsequent regression.)
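
      A minimal sketch of that check, assuming the propensity weight is stored in a variable called weight and the covariates are the ones from your loop:

      mean GDP ROL Politicalstab Globalizationindex if Year < 2003 [pweight=weight], over(SR)

      You then eyeball the group means directly rather than attaching a p-value to the comparison.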

      That said, if you are going to do a propensity weighted analysis, you don't really have to concern yourself with balance as long as you are using the entire sample in which the propensity scores were calculated for your regression. The propensity weight is a predicted probability that the observation will be in the treatment group that it actually is found in. It is a statistic calculated from multiple variables. Consequently, it is possible for the weighted means of two variables, call them X1 and X2, to show imbalance across treatment groups, yet the propensity-weighted or propensity-matched analysis will work just fine because the two imbalances "cancel each other out." I suppose finding balance on each variable would provide some additional reassurance, perhaps reassurance that the propensity weights were correctly calculated. But finding the absence of balance is not a problem for a propensity weighted analysis.

      By contrast, with propensity matching, it is important to demonstrate balance in the sample because usually some observations are left unmatched, or are badly matched, and omitted from the ultimate analysis, and this selection process may result in an estimation sample that is unbalanced and fails to remedy the confounding bias associated with the observed variables. With propensity weighting, however, you have the entire sample at your disposal for analysis, so the propensity weighting itself will remedy the confounding bias associated with the observed variables, even if the propensity weighted individual variable distributions are not balanced across treatment groups.
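
      If it helps to see that workflow end to end, here is a generic sketch of inverse-probability weighting on the full sample. The weights shown are ATE-style, outcome is just a placeholder for your dependent variable, and I am not assuming this is how your existing weight variable was built:

      logit SR GDP ROL Politicalstab Globalizationindex if Year < 2003
      predict double pscore if e(sample), pr
      * ATE-style weights: 1/p for treated, 1/(1-p) for controls
      * (for ATT-style weights, use 1 for treated and pscore/(1-pscore) for controls)
      generate double ipw = cond(SR == 1, 1/pscore, 1/(1 - pscore))
      regress outcome i.SR if Year < 2003 [pweight=ipw]

      The point is that every observation with a propensity score stays in the estimation sample; nothing is dropped for lack of common support.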



      • #18
        Thanks for all your help, Clyde!

        Q1. I’ve seen a few studies using Cohen’s D. Is it okay if I use that for checking balance on pre-treatment covariates, or should I stick to focusing on regression analysis?
        I used this command to calculate Cohen's D:
        sort SR
        by SR: summarize GDP [aweight=_weight] if Year < 2003

        I then calculated:
        scalar mean0 = x
        scalar sd0 = y
        scalar n0 = z
        scalar mean1 = x
        scalar sd1 = y
        scalar n1 = z
        scalar pooled_sd = sqrt(((n0 - 1) * (sd0^2) + (n1 - 1) * (sd1^2)) / (n0 + n1 - 2))
        scalar cohen_d = (mean1 - mean0) / pooled_sd
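
        (Rather than typing x, y, and z in by hand, I could have pulled the same quantities from r() after each summarize, e.g.:

        summarize GDP if SR == 0 & Year < 2003 [aweight=_weight]
        scalar mean0 = r(mean)
        scalar sd0 = r(sd)
        scalar n0 = r(N)
        summarize GDP if SR == 1 & Year < 2003 [aweight=_weight]
        scalar mean1 = r(mean)
        scalar sd1 = r(sd)
        scalar n1 = r(N)

        with the pooled_sd and cohen_d lines above unchanged.)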

        Q2. Also, when I use keep if _support, I lose some data. This seems to be because _support is marking observations within the common support region, right? It's keeping only those observations where there's overlap in propensity scores between treated and untreated groups.

        Q3. Additionally, when calculating the ATT, is there a different way to interpret the results compared to what we get from Propensity Score Matching (PSM)?

        Q4. One last question, can I use different matching methods like kernel, NN, etc., and present the different ATT coefficients from each method to compare their effects? And if my outcome variable is a dummy, can I still do the ATT and just explain it in terms of percentage effects? It’s just that there isn’t much on IPTW, and I’m trying to explore all options.

        Sorry for the barrage of questions, but your insights have been invaluable!



        • #19
          Q1. I don't think you can properly calculate Cohen's d in this context. The problem is that using -aweight-s gives you the wrong calculation of the standard deviation. You need -pweight-s, but -summarize- doesn't accept those, and commands like -mean- that do accept -pweight-s don't calculate a standard deviation; they calculate a standard error. Standard errors are a different quantity and are not suitable for use in Cohen's d.

          Q2. Yes, -keep if _support- causes you to lose the observations that are not in common support. This is one of the reasons I prefer propensity score weighting to propensity score matching. With weighting there is no need to restrict the data to the observations in common support: you use the entire sample.

          Q3. I don't understand this question.

          Q4. Yes, you can do any and all of these, and compare the results impressionistically if you like. But prepare yourself: what will you do if some of the methods lead you to different conclusions from the others? You will have to acknowledge in any presentation of the results that the findings are sensitive to the particular matching paradigm used.
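
          If you do go that route, the -teffects- suite is a convenient way to line the estimates up; a rough sketch, with outcome standing in for your dependent variable:

          * ATET via inverse-probability weighting
          teffects ipw (outcome) (SR GDP ROL Politicalstab Globalizationindex), atet
          * ATET via propensity-score matching
          teffects psmatch (outcome) (SR GDP ROL Politicalstab Globalizationindex), atet
          * ATET via nearest-neighbor matching on the covariates themselves
          teffects nnmatch (outcome GDP ROL Politicalstab Globalizationindex) (SR), atet

          With a binary outcome these ATT estimates are differences in probabilities, so reporting them in percentage-point terms is natural.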



          • #20
            Thank you very much, Clyde!
