Dear all
xtcdf is now available on SSC, with thanks as always going out to Kit Baum. It performs the Pesaran (2004) CD test for cross-sectional dependence, which can be used to test whether your variables or residuals are correlated between groups in a panel setting. For example, does employment in the different US states follow similar trends, or do the states move independently? Such correlations can affect your estimates (see the cross-sectional dependence literature), but it's also just good to know these things about your data. The program also reports the mean correlation coefficient.
In the end, the CD test is based on a transformation of the sum of all pairwise correlations between groups. The code is heavily based on -pwcorrf- (also on SSC), which calculates all these pairwise correlations in a more convenient and usually faster way than the official pwcorr command.
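For reference, the transformation can be written out as follows. This is the standard form of the statistic from Pesaran (2004), not something read off the xtcdf source, so treat the handling of unbalanced panels as an assumption. Here N is the number of groups, T_{ij} is the number of joint observations of groups i and j, and \hat{\rho}_{ij} is their pairwise correlation over those joint observations.

```latex
% CD statistic (general, possibly unbalanced panel)
CD = \sqrt{\frac{2}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \sqrt{T_{ij}} \, \hat{\rho}_{ij}

% Balanced panel: T_{ij} = T for all pairs
CD = \sqrt{\frac{2T}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \hat{\rho}_{ij}

% Mean pairwise correlation
\bar{\rho} = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \hat{\rho}_{ij}
```

Under the null hypothesis of cross-sectional independence, CD is asymptotically standard normal, so large absolute values point to dependence between groups.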
You can specify as many variables as you want (it loops internally).
Example usage (output at bottom of post)
Code:
sysuse xtline1.dta, clear
xtcdf calories
pwcorrf calories, reshape
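As a further illustration of the multi-variable and residual use cases mentioned above, here is a minimal sketch. It is not from the original post: it assumes the Grunfeld example data that ships with Stata (webuse grunfeld), assumes that xtcdf takes a plain varlist as in the example above, and uses res as a hypothetical variable name.

```stata
* Install both commands from SSC (only needed once)
ssc install xtcdf
ssc install pwcorrf

* Assumption: Grunfeld investment data, 10 companies over 20 years
webuse grunfeld, clear
xtset company year

* Several variables in one call; xtcdf loops over them
xtcdf invest mvalue kstock

* Test the residuals of a fixed-effects regression
xtreg invest mvalue kstock, fe
predict double res, e
xtcdf res
```

Testing the residuals rather than the raw variables shows whether dependence remains after conditioning on the regressors and the fixed effects.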
This is not the only command out there to calculate CD tests.
xtcsd was the first, but can only be used as a postestimation command.
xtcd is very slow in large datasets and reports the wrong average number of joint observations.
xtcd2 does not allow for multiple variables and assumes zero-mean variables (residuals).
Moreover, xtcdf is built to handle panels where some pairs of groups share fewer than 3 observations, which would otherwise lead to meaningless correlations. At the time of writing, the other commands behave erratically in this context. On the other hand, xtcd2 can produce kernel densities and a histogram of the pairwise correlations, which can be very useful.
References
Pesaran, M. Hashem. 2004. “General Diagnostic Tests for Cross Section Dependence in Panels.” CESifo Working Paper Series No. 1229, CESifo Group Munich.