Robust inference with clustered data, but very few clusters

April Ma

Join Date: Apr 2014

Posts: 8
#1

03 Apr 2014, 03:37

Dear all,

I want to run regressions with clustered data. My problem is that I have only about 30 clusters. In addition, my dependent variable is a ratio between 0 and 1, and 40% of the observations take a value of exactly 0 or 1. Therefore, I use the fractional response model, which is nonlinear. When I use the following command, my coefficient estimate is far from statistically significant.

glm y x1 x2 x3 ... x8, family(binomial) link(logit) vce(cluster district)

Clearly, I should deal with the few-cluster problem. Cameron, Gelbach and Miller (2008) suggest that a wild cluster bootstrap procedure is best for the case of few clusters, or alternatively, that one use a t distribution rather than the standard normal for the Wald statistic. However, I am not sure how to implement either in Stata. I found that there is a user-written program, cgmwildboot.ado, which has been mentioned in some papers. Was it written by Professor Judson Caskey? I could not open the website he provided, probably because I am located in China now and some sites are blocked by the government. Could anyone kindly provide me with the cgmwildboot.ado file (and the related help files), or tell me how to base the results on a t distribution in Stata? By default, the regression command uses the normal distribution. Since my model is not linear, can I apply these procedures directly? Thank you very much for your help!

Best,
April
Stephen Jenkins

Join Date: Apr 2014

Posts: 1422
#2

03 Apr 2014, 04:16

I recommend that you look at the state-of-the-art review by Cameron and Miller, "A Practitioner's Guide to Cluster-Robust Inference", downloadable from Cameron's home page at http://cameron.econ.ucdavis.edu/research/papers.html (the paper is forthcoming in the Journal of Human Resources). It discusses the wild cluster bootstrap, and also ad hoc methods that are computationally much simpler. Judson Caskey's programs appear to be located at https://sites.google.com/site/judsoncaskey/data (I did a simple Google search on his name). On a closely related theme (small number of units), you might also be interested in "Regression analysis of cross-national differences using multi-level data: a cautionary tale", IZA Discussion Paper No. 7491, http://ftp.iza.org/dp7491.pdf
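
As a rough illustration of the simpler adjustment asked about in #1, the p-value can be recomputed by hand from Stata's stored results using a t distribution with G - 1 degrees of freedom, where G is the number of clusters. This is only a sketch under assumptions: it presumes the glm model has just been fitted with cluster-robust standard errors (so that e(N_clust) is available), x1 stands in for whichever regressor is of interest, and the scalar names are made up for the example. Whether the G - 1 rule is adequate for a nonlinear model with about 30 clusters is a question the review above is better placed to answer.

```stata
* Run immediately after: glm ..., vce(cluster district)
* e(N_clust) is the number of clusters stored by the estimation command.
scalar G = e(N_clust)

* Wald statistic for x1 (x1 is an illustrative choice of regressor).
scalar tstat = _b[x1] / _se[x1]

* Two-sided p-value from a t distribution with G - 1 degrees of freedom.
scalar p_t = 2 * ttail(G - 1, abs(tstat))

* Two-sided p-value from the standard normal, for comparison.
scalar p_z = 2 * normal(-abs(tstat))

display "t(G-1) p-value = " p_t "   normal p-value = " p_z
```

With few clusters the t-based p-value is larger than the normal-based one, so this adjustment makes the test more conservative; it does not replace the wild cluster bootstrap.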
April Ma

Join Date: Apr 2014

Posts: 8
#3

03 Apr 2014, 07:34

Thank you, Professor Jenkins! For cgmwildboot.ado, the link you sent me is exactly the one I found and could not open in China (the Chinese government blocks Google). However, at least I now know that this program is a good one to use, and I will ask friends outside China to download it. I will also explore the literature you mentioned to understand this issue better. I appreciate your help!

Best,
April
Eric de Souza

Join Date: Mar 2014

Posts: 587
#4

03 Apr 2014, 08:22

The IZA paper to which you refer is dp7583. April, just change 7491 to 7583 in the link provided by Stephen.
April Ma

Join Date: Apr 2014

Posts: 8
#5

04 Apr 2014, 07:06

Hi edesouza, thank you for the note! I am new to the forum, but just realized that this is a great community!