I've spoken with Stata technical support over the past week or so, and here is what I've learned. (This post is intended for the curious, for anyone experiencing this issue, and for anyone using areg with a bootstrap cluster option.)
1. areg's vce(bootstrap) with a cluster option does not behave like regress's. areg adds extra fixed effects that are not in the original model, so the results differ from regress with the same model and, asymptotically, from areg with the vce(cluster ...) option. If there are too many expanded fixed effects, the regression cannot be run at all, as in my case.
Apparently this is intended behavior, although I cannot currently see exactly why.
2. In the -bootstrap- prefix, there is no way to disable the check. However, you can write your own wrapper program, which essentially circumvents it.
Details (#1):
areg [...], vce(bootstrap, cluster(x)) absorb(y) effectively adds a fixed effect for every combination of x and y (i.e., egen xy = group(x y)) to the model when bootstrapping. So if you have a panel of individuals, absorb individual fixed effects, and cluster on time, you end up with a fixed effect for every observation (individual-time fixed effects) and the model cannot be run. This explains my errors above.
This honestly looks like a bug to me, since areg has changed the model. regress [...] i.y, vce(boot, cluster(x)) does not behave this way, so the results diverge. The results also will not match areg [...], cluster(x) absorb(y) asymptotically. However, I am told that this is intended behavior.
I think the idea was to treat repetitions of the fixed effect as unique in the bootstrap, but I don't see this as a good solution, for the reasons above.
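To see why the expanded model cannot be estimated in a case like this, one can count the distinct x-by-y groups and compare that to the number of observations. A minimal sketch, assuming the grunfeld example data with company as the absorbed variable and year as the cluster variable (both choices are for illustration only):

```stata
* Sketch: count the company-by-year groups in a panel
webuse grunfeld, clear
egen xy = group(company year)
* group() numbers the groups 1, 2, ..., so the maximum is the group count
quietly summarize xy
display "groups = " r(max) "   observations = " _N
```

Each company-year pair appears once in this panel, so the two counts coincide, and a fixed effect per group would leave nothing to estimate the slope from.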
Details (#2):
Stata technical support provided a workaround that lets "custom" bootstraps ignore the dropped fixed effect problem. I've included their example code here:
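* Example data:
webuse grunfeld, clear
* Using -areg- in a wrapper program for -bootstrap-:
capture program drop myboot
program define myboot, rclass
    * Fit the model; only company effects are absorbed
    areg invest mvalue i.year, absorb(company)
    * Save the coefficient before clearing e()
    local bx = _b[mvalue]
    * Clear the e() results left behind by -areg-
    ereturn clear
    return scalar b_mvalue = `bx'
end
bootstrap b = r(b_mvalue), reps(10) seed(123) : myboot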
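The example above resamples individual observations. The -bootstrap- prefix also accepts a cluster() option, so the same wrapper idea can presumably be used for a clustered bootstrap. What follows is only a sketch and is not part of technical support's example: the program name myboot_cl and the choice of year as the cluster variable are assumptions made for illustration, and the year dummies are left out to keep it short.

```stata
* Sketch: same wrapper idea, but resampling whole clusters (years)
webuse grunfeld, clear
capture program drop myboot_cl
program define myboot_cl, rclass
    * areg is called without vce(bootstrap), so it absorbs company effects only
    areg invest mvalue, absorb(company)
    return scalar b_mvalue = _b[mvalue]
end
* cluster(year) tells -bootstrap- to draw years, not single observations
bootstrap b = r(b_mvalue), reps(10) seed(123) cluster(year) : myboot_cl
```

Because the resampling is handled by the prefix rather than by areg's own vce(bootstrap), the model fitted in each replication is the one written in the wrapper. In practice reps() should be much larger than 10.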