Dear all,
I read some posts here about clustering, and what I understood is that we need to consider the correlation between errors when choosing the clustering level.
I read a paper studying the education outcomes of students in different grades over time. Their data are constructed so that students are observed over the years as they pass from one grade to the next, and these students are enrolled in different schools.
Their model is as follows, where i is the student id, g the grade, s the school, and t the year:
Code:
y_i,g,s,t = beta1 X_g,s,t + beta2 Z_i,t + lambda_s,t + mu_g + epsilon_i,g,s,t
lambda_s,t : schoolXyear fixed effects
mu_g : grade fixed effects
They cluster standard errors at the schoolXgradeXyear level. However, I am not convinced by this clustering level. One could argue that the errors are correlated for a given student across years and grades, no? If so, the only possible clustering is at the school level, right? Am I missing something?
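To make my concern concrete, here is a small Python sketch on a made-up toy panel (the student ids, schools, and years are hypothetical, not the paper's data, and I am assuming no student changes school). It only checks which students have observations spread over more than one cluster under each choice of clustering level.

```python
from collections import defaultdict

# Hypothetical toy panel: (student, grade, school, year). Not data from the paper.
rows = [
    ("s1", 3, "A", 2010), ("s1", 4, "A", 2011), ("s1", 5, "A", 2012),
    ("s2", 3, "A", 2010), ("s2", 4, "A", 2011),
    ("s3", 4, "B", 2010), ("s3", 5, "B", 2011),
]

def clusters_per_student(rows, key):
    """Map each student to the set of cluster ids they appear in."""
    seen = defaultdict(set)
    for student, grade, school, year in rows:
        seen[student].add(key(grade, school, year))
    return dict(seen)

# Cluster ids under the two candidate clustering levels
by_sgy = clusters_per_student(rows, lambda g, s, t: (s, g, t))
by_school = clusters_per_student(rows, lambda g, s, t: s)

# Students whose observations are split across more than one cluster
split_sgy = sorted(st for st, c in by_sgy.items() if len(c) > 1)
split_school = sorted(st for st, c in by_school.items() if len(c) > 1)

print("split across schoolXgradeXyear clusters:", split_sgy)
print("split across school clusters:", split_school)
```

In this toy example every student is split across several schoolXgradeXyear clusters, so any within-student correlation in the errors would cross cluster boundaries, while no student is split across school clusters. If students did change schools, school clusters would not fully contain them either.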
Please let me know what you think.
All the best