  • Testing for autocorrelation in cross-sectional datasets

    Hi everyone,
    I am working with a dataset capturing individuals’ attitudes at one point in time (individual identifier: pid). The dataset samples individuals within households, so that each adult member of the household is a respondent (household identifier: hid). I am using the regress command to estimate the effect of respondents’ characteristics on their attitudes.
    I have two questions:
    • Is there a way to test whether there is autocorrelation in the data between members of the same households?
    • In case I identify autocorrelation: the data is also heteroskedastic. Should I use vce(cluster clustervar) or vce(robust) in order to address both issues?
    Thank you very much!

  • #2
    how different are the standard, robust, and clustered standard errors?
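
One way to make that comparison is sketched below. It assumes the shipped auto dataset, with rep78 standing in for the household identifier hid; the stored-estimate names m_ols, m_rob, and m_clu are made up for the illustration. With only five clusters the clustered standard errors here are a toy, not a recommendation.

```stata
* Illustration only: auto data, with rep78 as a stand-in for hid (assumption)
sysuse auto, clear
drop if mi(rep78)

* Same model three times; only the variance estimator changes
regress price weight mpg
estimates store m_ols

regress price weight mpg, vce(robust)
estimates store m_rob

regress price weight mpg, vce(cluster rep78)
estimates store m_clu

* Coefficients and standard errors side by side
estimates table m_ols m_rob m_clu, b(%9.3f) se(%9.3f)
```

The coefficient rows should match across the three columns; only the standard-error rows can differ.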



    • #3
      The term "autocorrelation" is not typically used in this setting because it suggests a natural ordering of the data. The usual term is "cluster correlation." Because the data are first sampled at the household level, you have a cluster sample, and you have little choice but to cluster at the household level. If you are including household fixed effects, there are assumptions under which clustering is not needed, but you would likely get pushback if you don't cluster. Typically one would have a fair number of households, each with a small number of people, in which case vce(cluster hid) is appropriate. With regress, vce(robust) isn't enough.
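
For anyone who still wants a descriptive look at the within-household correlation, one possible check is the intraclass correlation of the residuals. This is a sketch and not part of the advice above: it assumes the auto dataset with rep78 standing in for hid, and the names ehat and rho_hat are invented for the example.

```stata
* Illustration only: auto data, with rep78 as a stand-in for hid (assumption)
sysuse auto, clear
drop if mi(rep78)

* Fit the model with cluster-robust standard errors
regress price weight mpg, vce(cluster rep78)

* Residuals from the fitted model
predict double ehat, residuals

* One-way ANOVA of the residuals on the cluster variable;
* the output includes the estimated intraclass correlation
loneway ehat rep78
scalar rho_hat = r(rho)
display "intraclass correlation of residuals: " rho_hat
```

A value clearly above zero suggests that residuals within the same cluster move together. Clustering is still the default for a cluster sample whatever the estimate turns out to be.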



      • #4
        Robust and clustered standard errors are almost identical to each other (and very different from the conventional ones). Since clustered SEs account for both heteroskedasticity and cluster correlation, I am more confident that the analysis is robust to both violations. Thank you for the suggestion!
        Just a short follow-up question: the option vce(cluster hid) does not allow the option beta. I get the following error message:

        options vce(cluster clustvar) and beta may not be combined
        r(184);


        However, since vce(robust) or vce(cluster hid) only change standard errors and not coefficients, I could simply run the regression with vce(robust) and note the standardized regression coefficients from this output, right?
        Last edited by Anna Cusimano; 18 Oct 2024, 04:39.



        • #5
          Go through this and you'll see a path forward.

          Code:
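          * Example data; rep78 plays the role of the cluster variable
          sysuse auto, clear
          drop if mi(rep78)
          
          * Standardize the outcome and the regressors (mean 0, SD 1)
          egen price_s = std(price)
          egen weight_s = std(weight)
          egen mpg_s = std(mpg)
          
          * Standardized coefficients two ways: -beta- option vs. standardized variables
          reg price weight mpg, beta
          reg price_s weight_s mpg_s
          
          * Clustering changes the standard errors, not the coefficients
          reg price weight mpg, vce(cluster rep78)
          reg price_s weight_s mpg_s, vce(cluster rep78)
          
          * Standard deviations needed to compute the betas by hand
          summ price 
          local ysd = r(sd)
          summ weight
          local wsd = r(sd)
          summ mpg
          local msd = r(sd)
          
          * beta = b * sd(x) / sd(y), taken from the clustered regression
          reg price weight mpg, vce(cluster rep78)
          di _b[weight]*`wsd'/`ysd'
          di _b[mpg]*`msd'/`ysd'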



          • #6
            Anna:
            assuming, as per George's helpful reply, that you are using -regress-, the option -robust- takes heteroskedasticity only into account.
            If you detected serial correlation of the epsilon, or serial correlation plus heteroskedasticity, you should go with -vce(cluster clusterid)-, as reported in Stata Bookstore: Environmental Econometrics Using Stata, page 29.
            In this respect -regress- differs from -xtreg-, as in the latter the options -robust- and -vce(cluster clusterid)- both trigger the cluster-robust standard error.
            Kind regards,
            Carlo
            (StataNow 18.5)
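
The -xtreg- point can be checked with a small sketch. It assumes the auto dataset with rep78 declared as the panel variable purely for illustration; the scalar names se_rob and se_clu are invented for the example.

```stata
* Illustration only: auto data, rep78 used as the panel/cluster variable (assumption)
sysuse auto, clear
drop if mi(rep78)
xtset rep78

* Fixed-effects model with vce(robust)
xtreg price weight mpg, fe vce(robust)
scalar se_rob = _se[weight]

* Same model with explicit clustering on the panel variable
xtreg price weight mpg, fe vce(cluster rep78)
scalar se_clu = _se[weight]

* The two standard errors should coincide
display "robust: " se_rob "   cluster: " se_clu
```

Running the same two options with -regress- instead of -xtreg- would generally give two different standard errors, which is the contrast described above.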



            • #7
              Thank you very much for the explanations!
