I have unbalanced panel data with a control group and a treated group, identified by a dummy variable (VC_backing = 1 if the firm received the treatment).
I have another variable (VC_year) which represents the year in which a particular company received the treatment.
I want to use propensity score matching to match similar companies in the treatment and control groups based on certain characteristics (firm size and age). Matched firms must also be in the same country and industry.
In addition, the matching must be based on the VC backing year (the treatment year). In other words, the control firms must be matched to each treated firm in that firm's treatment year (VC_year), because afterwards I want to compare the dependent variables in the years before and after the treatment.
The variable VC_year is missing for the control group because those firms didn’t receive the treatment. However, the data is a panel, and I have a variable (year) that gives the year of each observation.
I want to use nearest-neighbor matching with replacement. In particular, I want to match each firm in the treatment group with 3 firms in the control group (1 to 3).
The covariates that should determine the propensity scores are firm size (ta) and age. Matched firms also need to be in the same country (ctryiso_num) and industry (naceccod2).
I am using Stata 14. I installed the psmatch2 package and read its help file, but I am still not sure how to do this correctly because I haven’t used PSM before.
I kindly request the assistance of anyone who has experience in this topic. Your support would be very meaningful to me.
This is an example of my data
----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(bvdid_num year ta age) long ctryiso_num int naceccod2 byte vc_backing float com_first_year
1 2012 212588.8 45 1 3700 0 .
1 2013 219870.1 46 1 3700 0 .
1 2014 262236.06 47 1 3700 0 .
1 2015 318467.4 48 1 3700 0 .
1 2016 329191.6 49 1 3700 0 .
1 2017 322287.75 50 1 3700 0 .
1 2018 510383.7 51 1 3700 0 .
1 2019 479433.9 52 1 3700 0 .
1 2020 495711.8 53 1 3700 0 .
1 2021 496129.3 54 1 3700 0 .
2 2010 124741.8 40 1 6120 0 .
2 2011 115580.33 41 1 6120 0 .
2 2012 111971.28 42 1 6120 0 .
2 2013 112442.6 43 1 6120 0 .
2 2014 108788.25 44 1 6120 0 .
2 2015 109198.38 45 1 6120 0 .
2 2016 111757.75 46 1 6120 0 .
2 2017 113605.73 47 1 6120 0 .
2 2018 114290.19 48 1 6120 0 .
2 2019 123616.23 49 1 6120 0 .
34 2014 11686.753 110 1 1105 1 2016
34 2015 13607.967 111 1 1105 1 2016
34 2016 18987.213 112 1 1105 1 2016
34 2017 20940.56 113 1 1105 1 2016
34 2018 19529.773 114 1 1105 1 2016
34 2019 19610.596 115 1 1105 1 2016
34 2020 25111.95 116 1 1105 1 2016
145 2003 2837 65 1 5224 1 2005
145 2004 3879 66 1 5224 1 2005
145 2005 4311 67 1 5224 1 2005
145 2006 4600.03 68 1 5224 1 2005
145 2008 4959.061 70 1 5224 1 2005
145 2009 10377.208 71 1 5224 1 2005
145 2010 6113.083 72 1 5224 1 2005
145 2011 6649.146 73 1 5224 1 2005
145 2012 6359.713 74 1 5224 1 2005
151 2001 7303 55 1 4662 1 2007
151 2002 7364 56 1 4662 1 2007
151 2003 7488 57 1 4662 1 2007
151 2004 8126 58 1 4662 1 2007
151 2005 5692 59 1 4662 1 2007
151 2006 3843.424 60 1 4662 1 2007
151 2007 2140.994 61 1 4662 1 2007
151 2008 707.467 62 1 4662 1 2007
151 2009 69.864 63 1 4662 1 2007
207 2000 17721 31 1 6202 1 2007
207 2001 15426 32 1 6202 1 2007
207 2002 12638 33 1 6202 1 2007
207 2003 12682 34 1 6202 1 2007
207 2004 12894 35 1 6202 1 2007
207 2005 15284 36 1 6202 1 2007
207 2006 16738 37 1 6202 1 2007
207 2007 18895 38 1 6202 1 2007
207 2011 12888.366 42 1 6202 1 2007
207 2012 13307.892 43 1 6202 1 2007
207 2013 12972.443 44 1 6202 1 2007
207 2014 15326.5 45 1 6202 1 2007
207 2015 16317.643 46 1 6202 1 2007
207 2016 16755.066 47 1 6202 1 2007
207 2017 17763.408 48 1 6202 1 2007
207 2018 21329.86 49 1 6202 1 2007
207 2019 23353.957 50 1 6202 1 2007
207 2020 21814.285 51 1 6202 1 2007
253 2000 10881.36 29 1 2740 1 2007
253 2001 10124 30 1 2740 1 2007
253 2002 11310 31 1 2740 1 2007
253 2003 13426 32 1 2740 1 2007
253 2004 16621 33 1 2740 1 2007
253 2005 15930 34 1 2740 1 2007
253 2006 23089.686 35 1 2740 1 2007
253 2007 20245.455 36 1 2740 1 2007
253 2008 18595.584 37 1 2740 1 2007
253 2009 11728.872 38 1 2740 1 2007
312 2002 8445 28 1 1012 1 2008
312 2003 10049 29 1 1012 1 2008
312 2004 12732 30 1 1012 1 2008
312 2005 15636 31 1 1012 1 2008
312 2006 20662.77 32 1 1012 1 2008
312 2007 27903.795 33 1 1012 1 2008
312 2008 32615.61 34 1 1012 1 2008
312 2009 37251.887 35 1 1012 1 2008
312 2010 33914.9 36 1 1012 1 2008
312 2011 34649.316 37 1 1012 1 2008
end
label values ctryiso_num ctryiso_num
label def ctryiso_num 1 "BE", modify
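To make the setup described above concrete, here is a rough sketch of one possible way to run it with psmatch2. It is an illustration under assumptions, not a tested solution. It assumes that com_first_year in the data example is the treatment year called VC_year in the text and that vc_backing is the treatment dummy. The names cand, cell, ps and w are hypothetical helper variables introduced only for this sketch. Treated firms enter once, using their treatment-year observation, control firms enter in every year, and psmatch2 is run separately within each country-industry-year cell so that matches are exact on those three variables.

```stata
* Sketch only. Assumes psmatch2 is installed (ssc install psmatch2) and that
* com_first_year is the treatment year (VC_year in the text).

* Candidate observations: treated firms in their treatment year only,
* control firms in every year they are observed.
gen byte cand = (vc_backing == 0) | (vc_backing == 1 & year == com_first_year)

* Exact-matching cells: same country, same industry, same calendar year.
egen long cell = group(ctryiso_num naceccod2 year) if cand

* Hypothetical holders for the results collected across cells.
gen double ps = .
gen double w  = .

* Loop over the cells that contain at least one treated firm.
levelsof cell if cand & vc_backing == 1, local(cells)
foreach c of local cells {
    * 3 nearest neighbours on the propensity score; nearest-neighbour
    * matching in psmatch2 is with replacement unless noreplacement is given.
    * capture: the logit fails in cells without both treated and control firms.
    capture psmatch2 vc_backing ta age if cand & cell == `c', neighbor(3) logit
    if _rc == 0 {
        replace ps = _pscore if cand & cell == `c'
        replace w  = _weight if cand & cell == `c'
    }
}
```

After the loop, observations with a non-missing w would be the treated firms and the controls used as matches. psmatch2 also creates _id and _n1 to _n3 on each run, which identify the matched neighbours, but they are overwritten at every iteration, so they would need to be saved inside the loop if the matched pairs themselves are required. In the excerpt shown, no treated firm shares an industry with a control firm, so the loop would find no matches there. A treated firm without an observation in its treatment year would also drop out of cand.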