  • Correlation Matrix Restricting Observations

    Hi, I am using the following code to compute correlations between variables with non-normal distributions. My code, with generic variable names, is:
    Code:
    spearman x1 x2 x3 x4, star(0.05)

    The number of observations used by this test is being restricted because x4 has fewer observations than x1, x2, and x3. To my understanding, because spearman is a rank correlation, this restriction is mathematically necessary. However, I would like to include my full set of observations because my dataset is limited in size and x4 is of interest. Is there a way to get around this statistically using rank correlation?

  • #2
    Your situation is not completely clear to me, but looking at the help file, I see the "pw" option, which I think is what you want; see
    Code:
    help spearman
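
A minimal sketch of what the pw option changes, using simulated data rather than the poster's dataset. The variable names follow the original post, but the data-generating steps (100 observations, x4 observed only for the first 60) are assumptions made purely for illustration.

```stata
* Simulated data: x4 is missing for observations 61-100 (illustrative assumption)
clear
set seed 12345
set obs 100
generate x1 = rnormal()
generate x2 = x1 + rnormal()
generate x3 = exp(rnormal())
generate x4 = x2 + rnormal() if _n <= 60

* Default (casewise deletion): every correlation uses only the 60
* observations that are nonmissing on all four variables
spearman x1 x2 x3 x4, stats(rho obs p) star(0.05)

* pw: each pair uses every observation that is nonmissing on that pair,
* so pairs not involving x4 use all 100 observations
spearman x1 x2 x3 x4, pw stats(rho obs p) star(0.05)
```

Adding stats(rho obs p) prints the number of observations behind each coefficient, which makes it easy to see which pairs are limited by x4.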


    • #3
      I agree with Rich Goldstein. You seem to be asking for pairwise calculations.

      The restriction you mention would bite with any kind of correlation. If one variable has missing values where the other doesn't, those observations cannot be included in the calculation. This applies with measurements and ranks alike.
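
One way to see how many observations are actually available, sketched on simulated data. The variable names follow the original post; the data themselves, and the helper variable nmiss, are hypothetical and only there to make the example runnable.

```stata
* Simulated data: x4 is missing for observations 61-100 (illustrative assumption)
clear
set seed 12345
set obs 100
generate x1 = rnormal()
generate x2 = x1 + rnormal()
generate x3 = exp(rnormal())
generate x4 = x2 + rnormal() if _n <= 60

* Number of missing values per observation across the four variables
egen byte nmiss = rowmiss(x1 x2 x3 x4)

* Observations complete on all four variables (used without pw)
count if nmiss == 0

* Observations usable for particular pairs (used with pw)
count if !missing(x1, x2)
count if !missing(x1, x4)
```

In this setup the counts are 60, 100, and 60: any correlation involving x4 is limited to the 60 observations where x4 is observed, whatever the type of correlation.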


      • #4
        Thank you Rich and Nick. Nick, your comment addresses my concern. I will just live with limited observations. My data is not paired; I apologize for the confusion there. Great to know about the pairwise option at any rate!


        • #5
          I don't follow your comment on pairing. Any kind of correlation requires two variables to be measured for each observation, e.g. person or place or time.

          FWIW, I don't think having non-normal marginal distributions is much of a barrier to using Pearson correlation. When the latter is useful, it is a measure of linearity. If you want P-values or confidence intervals, use simulation or bootstrapping. If those don't apply (e.g. independence is violated), then that is a bigger deal than not being normally distributed, and inference is probably off the table anyway. If correlation is bumped up (or down) by outliers and/or skewness, then that is what it is, and/or it means you would be better off on a transformed scale.
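
A rough sketch of the bootstrap suggestion, assuming independent observations. The skewed simulated data, the seeds, and the choice of 1,000 replications are all illustrative assumptions, not recommendations from the thread.

```stata
* Simulated skewed (lognormal-based) data, purely for illustration
clear
set seed 12345
set obs 100
generate x1 = exp(rnormal())
generate x2 = x1 + exp(rnormal())

* Pearson correlation on the original scale
correlate x1 x2

* Bootstrap the correlation; correlate leaves it in r(rho)
bootstrap r=r(rho), reps(1000) seed(2024) nodots: correlate x1 x2

* Percentile and bias-corrected confidence intervals
estat bootstrap, percentile bc

* The same correlation on a transformed (log) scale, for comparison
generate lnx1 = ln(x1)
generate lnx2 = ln(x2)
correlate lnx1 lnx2
```

Comparing the correlation on the raw and log scales gives a quick sense of how much skewness or outliers are driving the result.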
