Hello everybody,
I am currently working with a cross-sectional dataset of students in American high schools, collected through a two-stage sampling process: in the first stage, a fixed number of schools is randomly sampled from a population of schools (the first cluster unit), and in the second stage, a fixed number of individuals is randomly sampled within each school. I want to estimate a peer effect on an individual outcome with an ordinary least squares (OLS) regression. The outcome is the number of days per week each individual does physical exercise, and the peer effect is the leave-one-out mean of the number of days per week that his or her grade mates exercise. That is, the peer effect variable for individual i, who belongs to school j and, for example, grade 9, is the average number of days per week of physical exercise among the other grade 9 students in school j, excluding individual i. Here is my doubt. Since my treatment (the peer effect) is assigned at the school-grade level, which is a different cluster unit from the sampling unit (the school), I am afraid that if I cluster my standard errors at the school level, which is what the standard sampling literature recommends, they could be too conservative for a treatment that varies at the school-grade level.
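To make the construction of the peer variable concrete, here is a minimal sketch in Python of the leave-one-out mean described above. It is only an illustration, not my actual code: the field names `school`, `grade` and `days` and the toy records are assumptions, and students with no sampled grade mates are given a missing value.

```python
from collections import defaultdict


def leave_one_out_means(records):
    """Return, for each record, the mean of 'days' among the other
    students in the same (school, grade) group, excluding the student."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        key = (r["school"], r["grade"])
        totals[key] += r["days"]
        counts[key] += 1

    peer_means = []
    for r in records:
        key = (r["school"], r["grade"])
        n = counts[key]
        if n < 2:
            # No grade mates were sampled, so the peer mean is undefined.
            peer_means.append(None)
        else:
            # Subtract the student's own value before averaging.
            peer_means.append((totals[key] - r["days"]) / (n - 1))
    return peer_means


# Hypothetical toy data: three grade 9 students and one grade 10 student.
students = [
    {"school": "A", "grade": 9, "days": 2},
    {"school": "A", "grade": 9, "days": 4},
    {"school": "A", "grade": 9, "days": 6},
    {"school": "A", "grade": 10, "days": 3},
]
peer = leave_one_out_means(students)
```

Because each student's own value is removed, the variable differs across students within the same school-grade group, even though it is driven by the composition of that group.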
Having read the recent paper "When should you adjust standard errors for clustering?", published in The Quarterly Journal of Economics in February 2023, I have come around to the idea of clustering at my treatment assignment level (school-grade), but I am still unsure whether that is the right thing to do. I have 120 schools and roughly 360 school-grade groups, so the average number of observations per school-grade cluster is necessarily smaller than the average number of observations per school, which, if I am not mistaken, affects the asymptotic behavior of the variance estimator. I included school and grade fixed effects in my regression to mitigate the self-selection problem as much as possible.
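To show what changes between the two choices, here is a rough Python sketch of a cluster-robust (sandwich) standard error computed at both levels. It rests on strong simplifying assumptions that are mine, not part of my actual specification: there is a single regressor, `x` and `y` are taken to be already residualized on the school and grade fixed effects, no finite-sample correction is applied, and the numbers are made up. The function name `slope_and_cluster_se` is hypothetical.

```python
from collections import defaultdict
from math import sqrt


def slope_and_cluster_se(x, y, cluster_ids):
    """OLS slope of y on x (no intercept) and its cluster-robust
    standard error, without any finite-sample correction."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx

    # Sum the score x_i * residual_i within each cluster.
    scores = defaultdict(float)
    for xi, yi, g in zip(x, y, cluster_ids):
        scores[g] += xi * (yi - beta * xi)

    # Sandwich formula: "meat" is the sum of squared cluster scores.
    meat = sum(s * s for s in scores.values())
    return beta, sqrt(meat) / sxx


# Made-up, already residualized data (assumption for illustration only).
x = [1.0, -1.0, 0.5, -0.5, 2.0, -2.0, 1.5, -1.5]
y = [1.2, -0.7, 0.1, -0.6, 2.5, -1.8, 1.0, -1.9]
school = ["A", "A", "A", "A", "B", "B", "B", "B"]
grade = [9, 9, 10, 10, 9, 9, 10, 10]
school_grade = list(zip(school, grade))

b_school, se_school = slope_and_cluster_se(x, y, school)
b_sg, se_sg = slope_and_cluster_se(x, y, school_grade)
```

The point estimate is identical in both calls; only the grouping of the scores, and therefore the standard error, differs between clustering by school and by school-grade.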
Does anyone have any advice? Am I misunderstanding something about it?
Any feedback will be highly appreciated. Thanks a lot in advance.
Best regards,
Daniel