  • Bootstrapping clustered standard errors with a generated regressor


    I am trying to obtain standard errors clustered at the province level using the bootstrap.
    My data is household-level panel data (e.g. y_{ist}, x1_{ist}, x2_{ist}) merged with province-level panel data (e.g. X_{st}, Z_{st}), where i: household, s: province, and t: year.

    My code looks like this:

    capture program drop one_rep
    program define one_rep
        * First stage: all households; generate the fitted regressor
        reg X Z x1 x2 i.province i.year, vce(cluster province)
        capture drop X_hat
        predict double X_hat
        * Second stage: low-income households only
        qui reg y X_hat x1 x2 i.province i.year if low_income==1, vce(cluster province)
    end

    bootstrap, seed(4567) strata(province) reps(1000): one_rep

    In the first regression, I use all households and predict X_hat.
    In the second regression, I use only households with income below the median.
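
    Written out, the two steps are as follows. This is only a sketch of what one_rep estimates; the coefficient and fixed-effect symbols (\pi, \gamma, \beta, \delta, \mu_s, \tau_t, \alpha_s, \lambda_t) and the error terms are notation introduced here for illustration and do not appear in the code.

    ```latex
    % First stage, estimated on all households
    X_{st} = \pi Z_{st} + \gamma_1 x1_{ist} + \gamma_2 x2_{ist} + \mu_s + \tau_t + u_{ist}

    % Second stage, estimated only on households with low_income = 1,
    % where \hat{X}_{ist} is the fitted value from the first stage
    y_{ist} = \beta \hat{X}_{ist} + \delta_1 x1_{ist} + \delta_2 x2_{ist} + \alpha_s + \lambda_t + \varepsilon_{ist}
    ```

    Because x1 and x2 vary by household, the fitted value X_hat varies by household even though X and Z vary only by province and year.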

    1. Is this the correct way to obtain standard errors clustered at the province level?
    - When I test the above code against "ivregress 2sls", using the same full sample of households in both stages, the bootstrap SEs are larger than the clustered SEs from ivregress 2sls.
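
    A minimal sketch of that comparison, assuming X is treated as the endogenous regressor and Z as the excluded instrument (as in one_rep), with the low_income restriction dropped so that both stages use all households:

    ```stata
    * Benchmark: one-step 2SLS on the full sample,
    * with analytic standard errors clustered by province
    ivregress 2sls y x1 x2 i.province i.year (X = Z), vce(cluster province)
    ```

    With both stages run on the same sample, the coefficient on X from ivregress 2sls should match the coefficient on X_hat from the two-step program, so only the standard errors differ.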

    2. When I run the following code, I get the error message "repeated time values within panel; the most likely cause for this error is misspecifying the cluster(), idcluster(), or group() option":
    bootstrap, seed(4567) cluster(province) idcluster(newid) reps(1000) noisily: one_rep
    3. I then tried the following code but got a different error message, "collinearity in replicate sample is not the same as the full sample, posting missing values":
    bootstrap, seed(4567) cluster(province) idcluster(newid) group(hhid) reps(1000) noisily: one_rep
    This happens in every replication.

    4. The following code runs without error messages, but it produces standard errors clustered at the province*year level rather than the province level, right?
    bootstrap, seed(4567) cluster(province year) idcluster(newid) group(hhid) reps(1000) noisily: one_rep

    Last edited by Seongeun Kim; 24 Feb 2022, 02:09.