Dear All,
Your help with the propensity score matching method in STATA would be greatly appreciated. I have access to STATA 12, and all questions refer to commands available in STATA 12.
The focus of my research study is: does EU structural fund support in Latvia stimulate firms to invest additional funds in their research and development (R&D) activities? I want to see whether EU innovation support programmes complement or crowd out (substitute for) private R&D investment.
I want to use the propensity score matching method to estimate the average causal effect of participating in the programme, using a cross-sectional microdataset based on the Latvian edition of Eurostat’s Community Innovation Survey for 2010-2012. The sample consists of 121 firms in the treatment group and 307 in the control group. The treatment variable is a dummy for having or not having received support from the programme, and the outcome variable is the amount of R&D expenditure a company reported in 2012.
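In notation, this is how I understand the setup. The symbols are my own labels, and I am assuming that the effect of interest is the average treatment effect on the treated (ATT) and that the propensity score comes from the probit; please correct me if this is not the right way to frame it.

```latex
% D_i = 1 if firm i received programme support, 0 otherwise
% X_i = covariates entering the probit
% Y_i(1), Y_i(0) = R&D expenditure in 2012 with and without support
p(X_i) = \Pr(D_i = 1 \mid X_i) = \Phi(X_i'\beta)
\qquad
\mathrm{ATT} = E\left[\, Y_i(1) - Y_i(0) \mid D_i = 1 \,\right]
```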
The questions.
- I have some 20 covariates in the probit. Most of them have p-values close to 100%, some around 30%, and two below 5%. How is the propensity score calculated in this case? Does it take into account only the variables that are significant at the 5% level, or all of them? If no coefficient has a p-value below 5%, how is the pscore then calculated, and does pscore matching still make statistical sense?
- Also, I estimate pscores in both the T and C groups and look for the overlap between the two. It turns out that some individuals in the C group have a higher estimated probability of receiving treatment than some of those who were actually treated. How is that possible?
- More generally, it is confusing that a propensity score has to be calculated for treated individuals at all, since they were treated and would seem to have a 100% probability of receiving the treatment. What is the logic here?
- How do we account for the fact that we don’t know when in the three-year period covered by the data the subsidy was received, while R&D expenditure is reported only for the most recent year?
- Which matching type (kernel, nearest neighbour with or without replacement, caliper, etc.) would you recommend for our case, with 121 firms in the treatment group, 307 in the control group, and most probit coefficients insignificant at the 5% level?
- In STATA 12 what commands would you recommend to obtain the significance level of the Average treatment effect?
- How do I evaluate whether the overlap of propensity scores is satisfactory, given that the method is inappropriate if the overlap is small?
- The dataset also has sampling weights. How should they be accounted for in the probit, the t-tests and the matching? How do they change the analysis, and do we need to bother with them at all?
- How should the variables be modified/transformed and which variables should be included as control variables to make more coefficients significant in probit and improve the overall matching quality?
- What is common support and is it important to us?