Dear Statalist Forum,
I am currently writing my thesis and have set up the following equation:
Y = B0 + B1*D1*Post + B2*Size + B3*Age + B4*Industry
Where Y is the dependent variable and D1 = 1 for treatment and 0 for control. The term Post is a dummy variable for time, which is 0 before the treatment and 1 following it. Consequently, B1 is the difference-in-differences estimator. Size, Age, and Industry are included as control variables.
Having looked at other threads here on Statalist, it seems that a time indicator, i.e. + B5*Post, is often included in the regression in Stata, so that the equation would instead become:
Y = B0 + B1*D1*Post + B2*Size + B3*Age + B4*Industry + B5*Post
However, as most research in this area does not include the time indicator, I would like to know what the "correct" approach is. Is it "wrong" to leave out the time indicator? And what exactly is the difference in the interpretation of the results when including versus excluding it?
I can see that if I include the time indicator, the difference-in-differences estimate changes in some of the tests in Stata, so I would really appreciate any thoughts on this issue.
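To make the comparison concrete, here is a minimal sketch in Python with NumPy. It uses simulated data, not the thesis data. The cell size, the true effect of 2.0, and the common time shift of 0.5 are made-up assumptions, the helper `ols` is hypothetical, and the controls (Size, Age, Industry) are left out for brevity. It fits the two equations above, with and without the Post term, to the same noise-free data.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares fit of y on a constant plus the given columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Balanced 2x2 design: 50 observations per (D1, Post) cell (assumed).
d1 = np.repeat([0, 0, 1, 1], 50).astype(float)
post = np.repeat([0, 1, 0, 1], 50).astype(float)

true_effect = 2.0  # assumed treatment effect
time_shift = 0.5   # assumed change over time that hits both groups

# Noise-free outcome, so the coefficients can be checked exactly.
y = 1.0 + true_effect * d1 * post + time_shift * post

b_without_post = ols(y, d1 * post)       # Y = B0 + B1*D1*Post
b_with_post = ols(y, d1 * post, post)    # Y = B0 + B1*D1*Post + B5*Post

print("B1 without Post:", b_without_post[1])
print("B1 with Post:   ", b_with_post[1])
```

In this simulated setup, B1 recovers the assumed effect only when Post is included. Without it, part of the common time shift is attributed to D1*Post, which is one possible reason the estimate moves between the two specifications.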
Best Regards,
James