Dear all,
I have a question regarding clustering standard errors on industry.
I have a cross-sectional dataset of 94 observations (firms) with variables such as EBIT one year before the deal, EBIT one year after the deal, etc.
Furthermore, I added industry and year dummies. The industry dummies are based on the NACE Rev. 2 industry division (for example, industry division C consists of NACE Rev. 2 codes between 1000 and 3300).
When running the multivariate OLS regressions, I want to cluster the standard errors on industry so that industry-level shocks do not distort the standard errors (using vce(cluster variable)). However, does this need to be done at the four-digit NACE Rev. 2 level (around 45-55 clusters, depending on the dependent variable measure) or at the industry division level, as with the dummies (13 clusters)? I have read that one problem with having few clusters is that OLS leads to “overfitting”, with estimated residuals systematically too close to zero compared to the true error terms. This leads to a downward-biased cluster-robust variance matrix estimate.
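To make the two options concrete, the specifications being compared could be sketched roughly as follows. All variable names here are placeholders rather than names from the actual dataset, and division and nace4 are assumed to be numeric variables.

```stata
* Placeholder variable names (assumptions, not from the actual dataset):
*   ebit_post, ebit_pre : EBIT one year after / one year before the deal
*   division            : NACE Rev. 2 industry division (13 groups)
*   nace4               : four-digit NACE Rev. 2 code (about 45-55 groups)
*   year                : year used for the year dummies

* Option 1: cluster on the four-digit NACE code
regress ebit_post ebit_pre i.division i.year, vce(cluster nace4)

* Option 2: cluster on the industry division (same level as the dummies)
regress ebit_post ebit_pre i.division i.year, vce(cluster division)
```

Only the clustering variable differs between the two commands, so the coefficient estimates are identical and any difference shows up in the standard errors.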
Kind Regards,
Arno Meijer