Testing for autocorrelation in cross-sectional datasets

Anna Cusimano

Join Date: Jul 2024

Posts: 13
#1


10 Oct 2024, 10:55

Hi everyone,
I am working with a dataset capturing individuals’ attitudes at one point in time (individual identifier: pid). The dataset samples individuals within households, so that each adult member of the household is a respondent (household identifier: hid). I am using the regress command to estimate the effect of respondents’ characteristics on their attitudes.
I have two questions:
1. Is there a way to test whether there is autocorrelation in the data between members of the same households?
2. In case I identify autocorrelation: the data is also heteroskedastic. Should I use vce(cluster clustervar) or vce(robust) in order to address both issues?

Thank you very much!
George Ford

Join Date: Aug 2014

Posts: 3036
#2

10 Oct 2024, 13:20

How different are the default (conventional), robust, and clustered standard errors?
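
A minimal sketch of one way to put the three sets of standard errors side by side. It uses the shipped auto dataset, with rep78 standing in for the cluster variable purely for illustration; with the household data, the outcome, regressors, and hid would take their place. The stored-estimate names (conventional, hetrobust, clustered) are arbitrary labels.

```stata
* Illustration only: auto data, with rep78 as a stand-in cluster variable
sysuse auto, clear
drop if missing(rep78)

* Default (conventional) standard errors
regress price weight mpg
estimates store conventional

* Heteroskedasticity-robust standard errors
regress price weight mpg, vce(robust)
estimates store hetrobust

* Cluster-robust standard errors
regress price weight mpg, vce(cluster rep78)
estimates store clustered

* The coefficients are the same in every column; only the standard errors differ
estimates table conventional hetrobust clustered, b(%9.3f) se(%9.3f)
```

With only five clusters in this toy example the clustered standard errors themselves are not reliable; the point is only the layout of the comparison.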
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2081
#3

10 Oct 2024, 22:27

The term "autocorrelation" is not typically used in this setting because it suggests a natural ordering of the data. The usual term is "cluster correlation." Because the data are first sampled at the household level, you have a cluster sample, and you have little choice but to cluster at the household level. If you are including household fixed effects, there are assumptions under which clustering is not needed, but you would likely get pushback if you don't cluster. Typically one would have a fair number of households, each with a small number of people, in which case vce(cluster hid) is appropriate. With regress, vce(robust) isn't enough.
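
For the original question about checking for within-household correlation, one informal diagnostic (not a formal test, and not a substitute for clustering, which the sampling design calls for regardless) is the intraclass correlation of the OLS residuals. The sketch below assumes the shipped auto dataset with rep78 as a stand-in for hid; ehat is a hypothetical variable name.

```stata
* Illustration only: auto data, with rep78 as a stand-in for the household id
sysuse auto, clear
drop if missing(rep78)

* Fit the model and save the residuals
regress price weight mpg
predict double ehat, residuals

* One-way ANOVA of the residuals by cluster; reports the intraclass correlation
loneway ehat rep78
display "Residual intraclass correlation = " r(rho)
```

An intraclass correlation near zero suggests little within-cluster dependence in the residuals; a clearly positive value is what cluster-robust standard errors are meant to handle.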
Anna Cusimano

Join Date: Jul 2024

Posts: 13
#4

18 Oct 2024, 05:15

Robust and clustered standard errors are almost identical to each other (and very different from the conventional ones), but since clustered SEs account for both heteroskedasticity and cluster correlation, I am more confident that the analysis is robust to different violations. Thank you for the suggestion!
Just a short follow-up question: the option vce(cluster hid) does not allow the option beta. I get the following error message:

options vce(cluster clustvar) and beta may not be combined
r(184);

However, since vce(robust) or vce(cluster hid) only change standard errors and not coefficients, I could simply run the regression with vce(robust) and note the standardized regression coefficients from this output, right?

Last edited by Anna Cusimano; 18 Oct 2024, 05:39.

George Ford

Join Date: Aug 2014
Posts: 3036
#5

18 Oct 2024, 09:33

Go through this and you'll see a path forward.

Code:

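* Load the example data and drop observations missing the cluster variable
sysuse auto, clear
drop if mi(rep78)

* Standardized (z-score) versions of the outcome and the regressors
egen price_s = std(price)
egen weight_s = std(weight)
egen mpg_s = std(mpg)

* Without clustering: -beta- matches the regression on standardized variables
reg price weight mpg, beta
reg price_s weight_s mpg_s

* With clustering: -beta- is not allowed, but standardized variables still work
reg price weight mpg, vce(cluster rep78)
reg price_s weight_s mpg_s, vce(cluster rep78)

* Alternatively, rescale the raw coefficients by sd(x)/sd(y)
summ price
local ysd = r(sd)
summ weight
local wsd = r(sd)
summ mpg
local msd = r(sd)

reg price weight mpg, vce(cluster rep78)
di _b[weight]*`wsd'/`ysd'
di _b[mpg]*`msd'/`ysd'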
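
The two routes above should agree: a standardized coefficient is the raw coefficient multiplied by sd(x)/sd(y), provided the standard deviations are computed on the estimation sample. A minimal self-contained check, again on the auto data with rep78 as an illustrative cluster variable (the local macro beta_weight is a hypothetical name):

```stata
sysuse auto, clear
drop if missing(rep78)

* Route 1: rescale the raw clustered-regression coefficient by sd(x)/sd(y)
quietly summarize price
local ysd = r(sd)
quietly summarize weight
local wsd = r(sd)
quietly regress price weight mpg, vce(cluster rep78)
local beta_weight = _b[weight]*`wsd'/`ysd'

* Route 2: regress standardized variables (stored as double to limit rounding)
egen double price_s = std(price)
egen double weight_s = std(weight)
egen double mpg_s = std(mpg)
quietly regress price_s weight_s mpg_s, vce(cluster rep78)

* Both numbers should match up to rounding error
display `beta_weight' "   " _b[weight_s]
assert reldif(`beta_weight', _b[weight_s]) < 1e-8
```

If the regression drops observations (for example, because of missing values in a regressor), the standard deviations would need to be computed with `if e(sample)` for the match to hold.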


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17602
#6

18 Oct 2024, 09:58

Anna:
assuming, as per George's helpful reply, that you are using -regress-, the option -robust- takes only heteroskedasticity into account.
If you detected serial correlation of the epsilon, or serial correlation plus heteroskedasticity, you should go with -vce(cluster clusterid)-, as reported in Stata Bookstore: Environmental Econometrics Using Stata, page 29.
In this respect -regress- differs from -xtreg-, as in the latter the options -robust- and -vce(cluster clusterid)- both trigger the cluster-robust standard errors.

Kind regards,
Carlo
(StataNow 18.5)
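
A minimal sketch of the -xtreg- behaviour described above, assuming the nlswork example dataset (downloaded by -webuse-, so it needs an internet connection) and an arbitrary illustrative model; se_rob and se_clu are hypothetical scalar names.

```stata
* Illustration only: panel example data from the Stata Press website
webuse nlswork, clear
xtset idcode year

* Fixed-effects model with -vce(robust)-
quietly xtreg ln_wage age tenure, fe vce(robust)
scalar se_rob = _se[age]

* Same model, clustering explicitly on the panel identifier
quietly xtreg ln_wage age tenure, fe vce(cluster idcode)
scalar se_clu = _se[age]

* The two standard errors should coincide
display se_rob "   " se_clu
assert reldif(se_rob, se_clu) < 1e-10
```

Running the same pair of options with -regress- instead would generally give two different standard errors, which is the distinction drawn above.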
Anna Cusimano

Join Date: Jul 2024

Posts: 13
#7

21 Oct 2024, 11:12

Thank you very much for the explanations!