Hello everyone,
I have learned that having a different number of observations in each of the years used for a DID estimation is not a problem in itself.
However, I am still unsure about one thing.
I am studying a public policy that is implemented at the county level, and I am looking at its effect at the individual level using individual data. The data are not a true panel: they pool different individuals born in different years.
I use 3 years before the policy and 2 years after. The policy started in January 1998, so I take 1995, 1996 and 1997 as the pre-period and 1998 and 1999 as the post-period.
The problem is that I have very few observations for 1999: for example, in 1998 I have 110 observations, while in 1999 I have only 19. This is because those born in 1999 are not yet old enough at the time of the survey to report their outcome. The outcome is a test score, and these individuals are taking the test during the year of the survey, so not all of them have their score yet.
- I still want to include those born in 1999 to increase the sample size, but does this pose a problem for my DID?
- If I want to look at the effect by year of birth, using interactions of the treatment variable with year-of-birth indicators, is the coefficient for 1999 reliable?
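To make the second question concrete, here is a rough sketch of the specification I have in mind. The notation is only illustrative and is my own assumption: y is the test score of individual i in county c born in year b, alpha_c and lambda_b are county and year-of-birth fixed effects, Treat_c indicates counties that got the policy, and 1997 is the omitted reference year.

```latex
y_{icb} = \alpha_c + \lambda_b
        + \sum_{k \in \{1995,\,1996,\,1998,\,1999\}}
          \beta_k \,\bigl(\mathrm{Treat}_c \times \mathbf{1}[b = k]\bigr)
        + \varepsilon_{icb}
```

In this sketch, \beta_{1999} would be estimated only from the few individuals born in 1999 (19 in my example above), which is why I am worried about how reliable it is.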
I would be really grateful for your answers!
Let me know if you need more information about my data.