Hello,
I am analyzing the impact of an occupational deregulation policy in the United States on job mobility within the health industry. In other words, the research question is: "Does this policy lead to higher job outflows (job mobility) among health workers from a state that adopted it?"
Eight states were identified as the treatment cohort, while 12 states act as controls. The treatment date varies across states. My job-to-job flows data are multidimensional, broken down by industry, origin state, and destination state (e.g., a worker in the manufacturing industry who moved from Wyoming to Nebraska). The data are quarterly, from 2000Q2 to 2016Q1. I am adopting a difference-in-differences (DiD) approach.
After what seemed like a long time, my data set is finally structured in proper panel format. My question now is whether my Stata code is correct for what I am trying to achieve. My identifiers are geography_orig, geography, and industry, where geography_orig = origin state and geography = destination state. My time identifier is the quarterly date (qdate). First question: can there be three unique identifiers, as in my case?
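One way to make the first question concrete is to check that the three identifiers plus qdate pin down exactly one row each. The sketch below runs on made-up toy data (two origin states, two destination states, two industries, three quarters), not on the actual job-to-job flows file; only the variable names are taken from the post.

```stata
* Toy data only: 2 origin states x 2 destination states x 2 industries x 3 quarters
clear
set obs 24
gen geography_orig = mod(_n - 1, 2) + 1
gen geography      = mod(floor((_n - 1)/2), 2) + 1
gen industry       = mod(floor((_n - 1)/4), 2) + 1
gen qdate          = tq(2000q2) + floor((_n - 1)/8)
format qdate %tq

* isid exits with an error if these variables do not uniquely identify each row
isid geography_orig geography industry qdate
```

If isid passes on the real data, the three cross-sectional identifiers together with qdate define a valid panel key.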
I have already generated my time-varying treatment variable, treat_post. Now I want to run my DiD model with dummies as controls, so that the dummies absorb the effects particular to each state, industry, and quarter, leaving me with a "purer" effect of the policy. I also want to cluster my standard errors at the state and year level, as I think observations within the same state and year are correlated. This leads me to:
Note: logj2j = log of job outflows.
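Code:
* Combine origin state, destination state, and industry into a single panel identifier
egen id = group(geography_orig geography industry)
* Declare the panel structure: one series per id, observed at each quarterly date
xtset id qdate
* Build the clustering variable: one cluster per origin state and year
egen state_year = group(geography_orig year)
* DiD regression with origin-state, industry, and quarter dummies
reg logj2j treat_post i.geography_orig i.industry i.qdate, vce(cluster state_year)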
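For readers unfamiliar with the setup, treat_post in the regression above is a staggered-adoption indicator: it switches on for a treated state from its own adoption quarter onward and stays at zero for control states. Below is a minimal sketch of one way such a variable could be built, on toy data. The names state and adopt_qdate (the adoption quarter, missing for controls) are hypothetical, and this is not necessarily how the variable was constructed here.

```stata
* Toy data only: two states observed over four quarters each
clear
set obs 8
gen state = ceil(_n/4)
bysort state: gen qdate = tq(2005q1) + _n - 1
format qdate %tq

* Hypothetical adoption quarter: state 1 adopts in 2005q3, state 2 never adopts
gen adopt_qdate = tq(2005q3) if state == 1
format adopt_qdate %tq

* Equals 1 only for adopting states, from the adoption quarter onward
gen byte treat_post = !missing(adopt_qdate) & qdate >= adopt_qdate
list state qdate adopt_qdate treat_post, sepby(state)
```

The !missing() guard keeps never-treated states at zero regardless of how Stata orders missing values in the comparison.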
My second question: is this Stata code flawed?
Your help would be highly appreciated. Thank you.
Best,
Amy