Hi everybody,
I am analyzing the effect of a school construction program on education. I am using individual-level panel data (5 waves), matched on birthplace to the school construction data.
To identify individuals who were exposed to the program, I use the variation in year of birth and region of birth. Individuals born between 1968 and 1972 are the treatment group, and cohorts 1958 to 1963 form the control group. I interact this treatment dummy with the treatment intensity of the school program in each region, measured as schools built per 1,000 children; the interaction is youngXnin. I add region of birth and year of birth fixed effects and cluster the standard errors at the region of birth level. Furthermore, I control for the pre-program enrollment rate, the number of children, and another policy implemented at the regional level during the same period, each interacted with year of birth. The outcome is each individual's highest years of education (yoe), and I tag one observation per individual (tag==1).
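Written as an equation, the specification is roughly the following. This is only a sketch in my own shorthand, read off the areg command in this post: the symbol names are assumptions, "nin" is inferred from the variable name youngXnin, and I am assuming that en71, ch71 and wsppc are the enrollment, number-of-children and other-policy controls, in that order.

```latex
% i = individual, j = region of birth, k = year of birth
% young_k = 1 for the 1968-1972 cohorts, 0 for the control cohorts
% nin_j   = schools built per 1,000 children in region j (assumed name)
% X_j     = (en71_j, ch71_j, wsppc_j), regional controls (assumed mapping)
\[
  yoe_{ijk} = \alpha_j + \beta_k
            + \gamma \,(young_k \times nin_j)
            + X_j' \delta_k
            + \theta \, female_i
            + \varepsilon_{ijk}
\]
```

Here \alpha_j are the region of birth fixed effects (absorbed by abs(birthpl)), \beta_k are the year of birth fixed effects, \delta_k lets the effect of each regional control differ by birth cohort, and \gamma is the coefficient of interest.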
I have run the following regression:
Code:
areg yoe youngXnin i.yob i.yob#c.en71 i.yob#c.ch71 i.yob#c.wsppc female if tag==1, abs(birthpl) cluster(birthpl)
My result looks like this:
Code:
note: 1972.yob#c.en71 omitted because of collinearity
note: 1972.yob#c.ch71 omitted because of collinearity
note: 1972.yob#c.wsppc omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =      5,986
Absorbed variable: birthpl                      No. of categories =        242
                                                F(  42,    241)   =      35.82
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2778
                                                Adj R-squared     =     0.2420
                                                Root MSE          =     3.4042

                              (Std. Err. adjusted for 242 clusters in birthpl)
------------------------------------------------------------------------------
             |               Robust
         yoe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   youngXnin |   .1817468   .1288603     1.41   0.160    -.0720894     .435583
      female |  -1.008165   .1054618    -9.56   0.000     -1.21591   -.8004209
             |
         yob |
       1958  |  -1.454403   .6948977    -2.09   0.037    -2.823252   -.0855548
       1959  |   1.006075   .5946716     1.69   0.092    -.1653431    2.177492
       1960  |   1.720255   .6202648     2.77   0.006     .4984226    2.942088
       1961  |   .5529631   .5809999     0.95   0.342    -.5915233    1.697449
       1962  |   1.401703   .5385698     2.60   0.010     .3407979    2.462608
       1968  |   2.022092   .5760126     3.51   0.001     .8874304    3.156755
       1969  |   3.192242   .5136172     6.22   0.000      2.18049    4.203994
       1970  |    3.68983   .5406953     6.82   0.000     2.624738    4.754922
       1971  |   3.338817   .5653397     5.91   0.000     2.225179    4.452455
       1972  |   3.322119   .6062628     5.48   0.000     2.127868     4.51637
             |
  yob#c.en71 |
       1957  |  -3.188886   3.273782    -0.97   0.331    -9.637765    3.259993
       1958  |   3.326529   2.634466     1.26   0.208    -1.862991    8.516049
       1959  |   .0702168   1.809247     0.04   0.969     -3.49374    3.634174
       1960  |  -2.737319   1.668069    -1.64   0.102    -6.023176    .5485377
       1961  |  -1.312241   2.450678    -0.54   0.593    -6.139723    3.515242
       1962  |   .7871137   1.437824     0.55   0.585    -2.045193    3.619421
       1968  |   1.592497   1.799092     0.89   0.377    -1.951456     5.13645
       1969  |   .3691274   1.573141     0.23   0.815    -2.729733    3.467988
       1970  |   1.954537    1.64173     1.19   0.235    -1.279435    5.188509
       1971  |   .1603102   1.921413     0.08   0.934    -3.624596    3.945217
       1972  |          0  (omitted)
             |
  yob#c.ch71 |
       1957  |   5.17e-06   2.41e-06     2.15   0.033     4.30e-07    9.92e-06
       1958  |   4.08e-06   2.19e-06     1.86   0.064    -2.43e-07    8.40e-06
       1959  |  -7.02e-07   2.20e-06    -0.32   0.750    -5.04e-06    3.64e-06
       1960  |  -6.26e-07   2.18e-06    -0.29   0.774    -4.91e-06    3.66e-06
       1961  |   2.50e-06   2.22e-06     1.12   0.262    -1.88e-06    6.88e-06
       1962  |  -1.28e-06   2.22e-06    -0.58   0.564    -5.66e-06    3.09e-06
       1968  |   1.53e-06   2.35e-06     0.65   0.516    -3.11e-06    6.17e-06
       1969  |  -8.46e-07   2.15e-06    -0.39   0.695    -5.09e-06    3.40e-06
       1970  |  -1.76e-06   1.95e-06    -0.90   0.369    -5.61e-06    2.09e-06
       1971  |  -8.76e-07   2.00e-06    -0.44   0.662    -4.82e-06    3.06e-06
       1972  |          0  (omitted)
             |
 yob#c.wsppc |
       1957  |   1.761464   1.080673     1.63   0.104    -.3673056    3.890234
       1958  |   2.369885   .9174623     2.58   0.010      .562616    4.177154
       1959  |   .8455982    .492575     1.72   0.087    -.1247038      1.8159
       1960  |   .0375649   .4819009     0.08   0.938    -.9117106    .9868404
       1961  |   1.813205   .8031514     2.26   0.025     .2311121    3.395298
       1962  |   .4528254   .3856467     1.17   0.241    -.3068432    1.212494
       1968  |   .0566797   .4649344     0.12   0.903    -.8591742    .9725336
       1969  |   .3267577    .306535     1.07   0.288    -.2770721    .9305876
       1970  |  -.6072225   .4483307    -1.35   0.177    -1.490369    .2759245
       1971  |   .5620162   .4839056     1.16   0.247    -.3912082    1.515241
       1972  |          0  (omitted)
             |
       _cons |   5.588359   .5779029     9.67   0.000     4.449974    6.726745
------------------------------------------------------------------------------
Several studies have analyzed the effect of this program on education, e.g.:
Duflo, E. (2001). "Schooling and labor market consequences of school construction in Indonesia: Evidence from an unusual policy experiment." American Economic Review 91(4): 795-813.
Mazumder, B., et al. (2019). "Intergenerational human capital spillovers: Indonesia's school construction and its effects on the next generation." AEA Papers and Proceedings.
Both find significant effects of the program on education. Duflo (2001) uses different data, but Mazumder et al. (2019) use data from the same source as I do.
Question:
I am wondering why my estimates differ from theirs in magnitude and significance. I have checked my data and code several times, but I don't see the problem.
Also, as soon as I add the fixed effects, the estimate on the treatment loses significance. This would indicate that there is not enough variation in the treatment within cohorts and regions, right?
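One way I could check this is sketched below; it is an illustration, not something I have run. The idea is to regress the treatment variable on everything else in the model, including the absorbed region of birth effects, and look at how much of its variation is left in the residual, since only that residual variation identifies the coefficient on youngXnin. The variable name treat_resid is hypothetical; all other names come from the regression above.

```stata
* Sketch: how much variation in the treatment survives the fixed effects?
* Regress the treatment on all other regressors, absorbing region of birth,
* on the same observations used in the main regression.
areg youngXnin i.yob i.yob#c.en71 i.yob#c.ch71 i.yob#c.wsppc female ///
    if tag==1 & !missing(yoe), abs(birthpl)

* Keep the part of the treatment not explained by the fixed effects and controls.
* treat_resid is a hypothetical variable name.
predict double treat_resid if e(sample), residuals

* Compare the spread of the raw treatment with the spread of its residual.
summarize youngXnin treat_resid if e(sample)
```

If the standard deviation of treat_resid is very small relative to that of youngXnin (equivalently, if the R-squared of this auxiliary regression is close to 1), then little identifying variation is left once the fixed effects and cohort-specific controls are included.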
Do you have any ideas?
Any advice is appreciated!