Hi everyone,
I'm trying to implement a difference-in-differences (DD) analysis using panel data; a sample of the data is in the first code block below. The data cover about 2,000 microentrepreneurs (id) over two time periods (wave). I would like to estimate the effect of lockdown length (due to COVID) on the uptake of Mobile Money (MM), i.e. financial transactions via smartphone. I use Stata 15.
The variable q35_covid records the length of the lockdown for each entrepreneur, in weeks. Since lockdown length is only observed after COVID hit (i.e. only in wave 2), I set q35_covid to zero for wave 1. I hope that does not cause any problems.
anyMMactivity_conistent is a binary variable indicating whether an individual uses MM.
I created a pre_post variable which is 0 for wave 2 and 1 for wave 1.
Furthermore, I created a treatment variable (treatment_2) which is 0 if zero weeks were spent in lockdown and 1 if more than zero weeks were spent in lockdown.
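For reference, here is a minimal sketch of how indicators like these could be built. The names post, max_weeks, and treated are hypothetical (they are not in my data), and the sketch assumes one row per id and wave. Stata treats missing values as larger than any number, so a bare q35_covid > 0 would be true for missing values; hence the !missing() guards.

```stata
* Sketch only: post, max_weeks and treated are hypothetical names.
* Post-period indicator: 1 in wave 2, 0 in wave 1 (the reverse of pre_post).
generate byte post = (wave == 2) if !missing(wave)

* Longest lockdown recorded for each id across waves (egen max() ignores missings).
bysort id: egen max_weeks = max(q35_covid)

* Treated if any lockdown weeks were reported; stays missing if weeks are unknown.
generate byte treated = (max_weeks > 0) if !missing(max_weeks)

* Check the coding against the existing variables.
tabulate wave post, missing
tabulate treated treatment_2, missing
```

With post coded this way, the interaction coefficient would read in the usual "after minus before" direction, whereas with pre_post (1 = wave 1) its sign is flipped.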
I first set up the DD as a pooled OLS regression; the command and its output are the first regression shown below.
I have the following problems. First, the coefficients appear nonsensical to me, which makes me wonder whether I am interpreting them correctly. Second, since the term of interest is insignificant, I wonder whether my design is flawed. Is it a problem that my outcome variable is binary? Should I use a probit or logit model instead? Can a DD design be implemented with probit or logit in Stata?
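To make the probit/logit question concrete, this is a sketch of what I imagine a logit version could look like, reusing the variables described above. Clustering on id is my assumption (each entrepreneur appears in both waves). As far as I understand, the interaction coefficient in a logit is on the log-odds scale, so it is not directly comparable to the OLS interaction; margins is used here to get back to probabilities.

```stata
* Sketch only: logit version of the same 2x2 DD specification.
logit anyMMactivity_conistent i.pre_post##i.treatment_2, vce(cluster id)

* Predicted probability of MM use in each pre_post x treatment_2 cell.
margins pre_post#treatment_2

* Change in predicted probability when pre_post goes from 0 to 1, by treatment group.
margins treatment_2, dydx(pre_post)
```

Comparing the two dydx() values across treatment groups would give a DD-style contrast on the probability scale, if that is a valid way to read it.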
My third question concerns the treatment variable. I wonder whether it is possible to exploit the heterogeneity in q35_covid. Is there a DD design with continuous treatment in Stata? I tried interacting pre_post with q35_covid; the command and its output are the second regression shown below.
Why is the interaction term omitted in that regression?
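My guess at part of the answer: q35_covid is 0 in wave 1 and pre_post is 0 in wave 2, so the product pre_post × q35_covid should be zero in every row where it is defined, which would explain the collinearity note. The sketch below checks that and then tries a time-invariant version of the lockdown length. The names inter_check and dose are hypothetical, and using the last wave's value as the dose assumes every id is observed in wave 2.

```stata
* Sketch only: inter_check and dose are hypothetical names.
* The interaction is pre_post * q35_covid; by construction it should be 0 wherever defined.
generate double inter_check = pre_post * q35_covid
summarize inter_check

* Carry each id's wave-2 lockdown length to both of its rows.
bysort id (wave): generate dose = q35_covid[_N]

* DD with a continuous, time-invariant dose; standard errors clustered on id.
regress anyMMactivity_conistent i.pre_post##c.dose, vce(cluster id)
```

If summarize reports a mean, minimum, and maximum of 0 for inter_check, the interaction column carries no information and Stata has to drop it.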
Code:
. list id wave pre_post anyMMactivity_conistent q35_covid treatment_2 if id <= 10

     +-------------------------------------------------------+
     | id   wave   pre_post   anyMMa~t   q35_co~d   treatm~2 |
     |-------------------------------------------------------|
  1. |  1      1          1          1          0          1 |
  2. |  1      2          0          1         12          1 |
  3. |  2      1          1          0          0          1 |
  4. |  2      2          0          1         20          1 |
  5. |  3      1          1          1          0          1 |
     |-------------------------------------------------------|
  6. |  3      2          0          1          1          1 |
  7. |  4      1          1          1          0          1 |
  8. |  4      2          0          1          8          1 |
  9. |  5      1          1          0          0          1 |
 10. |  5      2          0          1          1          1 |
     |-------------------------------------------------------|
 11. |  6      1          1          1          0          0 |
 12. |  6      2          0          1          0          0 |
 13. |  7      1          1          1          0          1 |
 14. |  7      2          0          1         20          1 |
 15. |  8      1          1          1          0          1 |
     |-------------------------------------------------------|
 16. |  8      2          0          1         12          1 |
 17. |  9      1          1          0          .          0 |
 18. |  9      2          0          .          .          0 |
 19. | 10      1          1          0          0          1 |
 20. | 10      2          0          1          8          1 |
     +-------------------------------------------------------+
Code:
reg anyMMactivity_conistent i.pre_post##i.treatment_2, r
Code:
. reg anyMMactivity_conistent i.pre_post##i.treatment_2, r //Robust std. errors

Linear regression                               Number of obs     =      4,152
                                                F(3, 4148)        =     391.99
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2034
                                                Root MSE          =     .41437

-------------------------------------------------------------------------------
              |               Robust
anyMMactivi~t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
   1.pre_post |  -.3872571   .0235074   -16.47   0.000    -.4333441   -.3411701
1.treatment_2 |   .0714418   .0165586     4.31   0.000     .0389781    .1039054
              |
     pre_post#|
  treatment_2 |
         1 1  |  -.0368784   .0279018    -1.32   0.186    -.0915808     .017824
              |
        _cons |   .8530466   .0149958    56.89   0.000     .8236469    .8824463
-------------------------------------------------------------------------------
Code:
reg anyMMactivity_conistent i.pre_post##c.q35_covid, r
Code:
. reg anyMMactivity_conistent i.pre_post##c.q35_covid, r //Robust std. errors
> note: 1.pre_post#c.q35_covid omitted because of collinearity

Linear regression                               Number of obs     =      3,950
                                                F(2, 3947)        =     490.31
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1959
                                                Root MSE          =     .41019

------------------------------------------------------------------------------
             |               Robust
anyMMactiv~t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  1.pre_post |  -.3870294   .0152382   -25.40   0.000     -.416905   -.3571539
   q35_covid |   .0021529   .0008803     2.45   0.015      .000427    .0038787
             |
    pre_post#|
 c.q35_covid |
          1  |          0  (omitted)
             |
       _cons |   .8872826   .0102726    86.37   0.000     .8671426    .9074226
------------------------------------------------------------------------------
Looking forward to your answers.