Hello everyone!
I am stuck on how to set up an event study analysis that produces a meaningful coefplot, and I would appreciate your help.
I am working with reshaped KDHS 2022 data to assess teenage births in Kenya before and after the COVID-19 pandemic. I am replicating the event study analysis of a paper on how the pandemic affected school bullying and cyberbullying in the US, which focuses on pre- and post-pandemic trends. I use the commands (and data) below, but I am not sure how or where in my regression equations I should restrict to the pre_sample and sample variables without getting "no observations" errors. The commands below run without errors, but the output is not meaningful because the regressions still need to be restricted to those samples. I would also like to know how to create a set of 12 fixed effects, one for each calendar month, for my initial regression. I have also attached the two regression equations I use in my analysis.
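To make the first-stage question concrete, here is a minimal sketch of a regression with a linear year trend and 12 calendar-month fixed effects, estimated only on a pre-pandemic window. It runs on made-up toy data, and the names mdate, year, month, births and births_r are hypothetical, not variables from the KDHS file. It assumes that factor-variable notation (i.month) is an acceptable way to get the month effects and that residuals are wanted for every month, not only the estimation window.

```stata
* Toy data (hypothetical): one row per calendar month, Jan 2015 to Jun 2022
clear
set seed 12345
set obs 90
gen mdate = tm(2015m1) + _n - 1          // monthly date on the %tm scale
format %tm mdate
gen year  = yofd(dofm(mdate))            // calendar year
gen month = month(dofm(mdate))           // calendar month, 1 to 12
gen births = 20 + 0.5*(year - 2015) + 2*sin(month) + rnormal()  // made-up outcome

* Pre-pandemic estimation window, built with monthly (not daily) constants
gen byte pre_sample = inrange(mdate, tm(2017m1), tm(2019m12))

* i.month expands to the 12 calendar-month categories (one is the base level)
regress births c.year i.month if pre_sample

* predict computes residuals for all rows, including those outside the window
predict births_r, residuals
```

Restricting with `if pre_sample` on the regression line, while leaving `predict` unrestricted, is one way to fit the trend and month effects on pre-pandemic data only and still obtain residuals for the later months.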
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long respid str2 which int v010 byte(v101 b1_) int b2_
1 "01" 2003 32 . .
1 "02" 2003 32 . .
1 "03" 2003 32 . .
1 "04" 2003 32 . .
1 "05" 2003 32 . .
1 "06" 2003 32 . .
1 "07" 2003 32 . .
1 "08" 2003 32 . .
1 "09" 2003 32 . .
2 "01" 2006 28 . .
2 "02" 2006 28 . .
2 "03" 2006 28 . .
2 "04" 2006 28 . .
2 "05" 2006 28 . .
2 "06" 2006 28 . .
2 "07" 2006 28 . .
2 "08" 2006 28 . .
2 "09" 2006 28 . .
3 "01" 2005 44 . .
3 "02" 2005 44 . .
3 "03" 2005 44 . .
3 "04" 2005 44 . .
3 "05" 2005 44 . .
3 "06" 2005 44 . .
3 "07" 2005 44 . .
3 "08" 2005 44 . .
3 "09" 2005 44 . .
4 "01" 2007 38 . .
4 "02" 2007 38 . .
4 "03" 2007 38 . .
4 "04" 2007 38 . .
4 "05" 2007 38 . .
4 "06" 2007 38 . .
4 "07" 2007 38 . .
4 "08" 2007 38 . .
4 "09" 2007 38 . .
5 "01" 2002 28 11 2021
5 "02" 2002 28 . .
5 "03" 2002 28 . .
5 "04" 2002 28 . .
5 "05" 2002 28 . .
5 "06" 2002 28 . .
5 "07" 2002 28 . .
5 "08" 2002 28 . .
5 "09" 2002 28 . .
6 "01" 2004 26 . .
6 "02" 2004 26 . .
6 "03" 2004 26 . .
6 "04" 2004 26 . .
6 "05" 2004 26 . .
6 "06" 2004 26 . .
6 "07" 2004 26 . .
6 "08" 2004 26 . .
6 "09" 2004 26 . .
7 "01" 2005 28 . .
7 "02" 2005 28 . .
7 "03" 2005 28 . .
7 "04" 2005 28 . .
7 "05" 2005 28 . .
7 "06" 2005 28 . .
7 "07" 2005 28 . .
7 "08" 2005 28 . .
7 "09" 2005 28 . .
8 "01" 2006 17 . .
8 "02" 2006 17 . .
8 "03" 2006 17 . .
8 "04" 2006 17 . .
8 "05" 2006 17 . .
8 "06" 2006 17 . .
8 "07" 2006 17 . .
8 "08" 2006 17 . .
8 "09" 2006 17 . .
9 "01" 2003 29 . .
9 "02" 2003 29 . .
9 "03" 2003 29 . .
9 "04" 2003 29 . .
9 "05" 2003 29 . .
9 "06" 2003 29 . .
9 "07" 2003 29 . .
9 "08" 2003 29 . .
9 "09" 2003 29 . .
10 "01" 2006 3 . .
10 "02" 2006 3 . .
10 "03" 2006 3 . .
10 "04" 2006 3 . .
10 "05" 2006 3 . .
10 "06" 2006 3 . .
10 "07" 2006 3 . .
10 "08" 2006 3 . .
10 "09" 2006 3 . .
11 "01" 2004 3 5 2021
11 "02" 2004 3 . .
11 "03" 2004 3 . .
11 "04" 2004 3 . .
11 "05" 2004 3 . .
11 "06" 2004 3 . .
11 "07" 2004 3 . .
11 "08" 2004 3 . .
11 "09" 2004 3 . .
12 "01" 2004 37 . .
end
label values v101 V101
label def V101 3 "kilifi", modify
label def V101 17 "makueni", modify
label def V101 26 "trans nzoia", modify
label def V101 28 "elgeyo-marakwet", modify
label def V101 29 "nandi", modify
label def V101 32 "nakuru", modify
label def V101 37 "kakamega", modify
label def V101 38 "vihiga", modify
label def V101 44 "migori", modify
Code:
gen birthyear = ym(b2_, b1_)
gen teenager = inrange(b2_ - v010, 15, 19)

* Change format of birth year variable
gen birthyear_C = birthyear
format %tm birthyear_C

/* Creates a binary variable pre_sample, taking the value 1 if the birth year falls within the range from 2017 to 2019, and 0 otherwise. The idea is to generate a measure of teenage births that removes calendar fixed effects and linear year trends based on pre-pandemic patterns in teenage births */
gen pre_sample = (birthyear_C >= date("2017m1","YM")) & (birthyear_C <= date("2019m12","YM"))

* Taking 2016 as the base year because the analysis begins with year 2017. By subtracting 2016 from each year, I intend to set 2017 as year 1 in the event study analysis
replace b2_ = b2_ - 2016

contract birthyear_C pre_sample b2_ b1_ v101 respid if teenager, nomiss
rename _freq TeenageBirths

* μ_m(t) indicates a set of 12 fixed effects for each calendar month and β captures a linear time trend in the years before
* COVID-19
reg TeenageBirths b2_ i.b1_ // EQUATION 1: TeenageBirths = β Year_t + μ_m(t) + ε_st
predict TeenageBirths_r, resid

* After calculating the residuals (using predict above), replace missing values with zeros
gen TeenageBirths_r_mi = TeenageBirths_r == .
replace TeenageBirths_r = 0 if TeenageBirths_r_mi == 1

* Calculating difference in months between Jan 2019 and a specific yearmonth of birth
g montht = birthyear_C-tm(2019m1)

/* Calculate the month relative to February 2020 and creates indicator variables for the pre-period (preperiod), pre-months (premonth*), and post-months (postmonth*). 1/12 rep. 12 months before Feb'19 & 14/25 rep. 12 months after */

* Having an indicator for months between Jan 2017 and Jan 2019.
* Then, including indicators for each of the 12 months prior to Feb 2020
g preperiod = (montht<0)
forval j = 1/12 {
    g premonth`j' = (montht==(`j'))
    lab var premonth`j' " "
}

* Including indicators for each of the 12 months after Feb 2020
forval j = 14/25 {
    g postmonth`j' = (montht==(`j'))
    label var postmonth`j' " "
}

* Creating labels for variables that represent different time periods in the event study analysis. Zero is constant and represents February 2020 in the event study.
g zero = 0
label var zero "Feb 20"
lab var premonth1 "Feb 19"
lab var premonth7 "Aug 19"
lab var postmonth19 "Aug 20"
lab var postmonth25 "Feb 21"

* Data from Jan 2017 to June 2022
gen sample = (birthyear_C >= date("2017m1","YM")) & (birthyear_C <= date("2022m6","YM"))

* For the outcome variable, generate indicator variables (*_r_mi_pre and *_r_mi_post) indicating missing values (*_r_mi) during pre-pandemic and post-pandemic periods
gen TeenageBirths_r_mi_pre = (TeenageBirths_r_mi * (birthyear_C < date("2020m3","YM")))
gen TeenageBirths_r_mi_post = (TeenageBirths_r_mi * (birthyear_C >= date("2020m3","YM")))

rename v101 region

* Conducting regression analysis using reghdfe command. Includes indicators for pre-period, pre-months, and post-months as independent variables along with indicator variables for missing values during the pre and post periods
reghdfe TeenageBirths_r preperiod premonth* zero postmonth* TeenageBirths_r_mi_pre TeenageBirths_r_mi_post, a(region) vce(cluster region birthyear_C) //insufficient observations
est sto TeenageBirths_r

coefplot TeenageBirths_r, omit ///
    keep(zero premonth* postmonth*) ///
    ylabel(-1(0.5)0.5, labsize(small)) ///
    xlabel(, labsize(small)) ///
    vertical legend(off) nooffset msize(small) ///
    xline(13, lp(dash) lwidth(thin)) ///
    title("TeenageBirths", size(medsmall)) ///
    coeflabels(, labsize(small)) yline(0) ///
    mcolor(gs0) msymbol(O) ciopts(lcolor(gs0) lw(vvthin)) ///
    saving(a.gph, replace)
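One possible source of the "no observations" problem is the scale of the date constants: birthyear_C is a monthly (%tm) value, while date() returns daily values. The sketch below illustrates that mismatch on a toy panel and builds the window flag with tm() instead. The toy data and the names mdate, flag_daily and flag_monthly are hypothetical, and whether this is the actual cause in the KDHS data is an assumption.

```stata
* %tm values count months since 1960m1, so window bounds belong on that scale
display tm(2017m1)       // 684
display ym(2017, 1)      // same value, built from numeric year and month

* Toy panel (hypothetical): 2 regions x 66 months, Jan 2017 to Jun 2022
clear
set obs 132
gen region = cond(_n <= 66, 1, 2)
bysort region: gen mdate = tm(2017m1) + _n - 1
format %tm mdate

* A daily-scale bound is far larger than any %tm value here, so the flag is never 1
gen byte flag_daily = mdate >= mdy(1, 1, 2017)

* A monthly-scale bound gives the intended Jan 2017 to Jun 2022 window
gen byte flag_monthly = inrange(mdate, tm(2017m1), tm(2022m6))

* Event time as in the post: 0 = Jan 2019, so 13 = Feb 2020
gen montht = mdate - tm(2019m1)

count if flag_daily
count if flag_monthly
```

With a flag built this way, the restriction can go on the estimation command itself (for example `... if flag_monthly`) rather than into the data-building steps.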