  • results missing from correlation matrix?

    Hi guys, I am not sure why STATA dropped some variables in the corr table, as attached.

    The second picture shows me trying to check whether one variable is a subset of the other, but I don't think so, right?

    Can someone help? Thanks.

    J



  • #2
    First, please read the Forum FAQ for guidance about the helpful way to show Stata output, which is the use of code delimiters. Note also that screenshots are specifically discouraged. Your screenshot is not readable on my setup, and probably not on some others' either; screenshots often are unreadable. That's why we ask people not to use them.

    Nevertheless, even without being able to see your output, I can say that you will get a missing value for a correlation coefficient if a variable is actually constant. By "actually constant" I mean constant in those observations that are used by the -corr- command. -corr- works like an estimation command such as -regress-: it leaves out every observation in which any of the listed variables has a missing value. So, even if a certain variable is not constant in the complete data set, if it is constant in those observations where all of the variables mentioned in your -corr- command are non-missing, you will see missing values for its correlation coefficients.

    The missing value for correlation, by the way, is math, not some peculiarity of Stata. If you look at the formula for the correlation coefficient in a basic statistics text, you will see that it is a ratio, and the denominator is the product of the standard deviations of the variables. So if one of the variables has a standard deviation of 0, the correlation coefficient is undefined.
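
    As a minimal sketch of both points, assuming the auto data shipped with Stata (price_f is a hypothetical variable made up for this illustration, not anything from your data):

    Code:
    sysuse auto, clear
    * hypothetical variable: price is kept only for foreign cars
    generate price_f = price if foreign == 1
    * foreign varies in the full data, but equals 1 wherever price_f is nonmissing
    correlate mpg foreign price_f
    * check the standard deviation of foreign in the observations -corr- used
    summarize mpg foreign price_f if !missing(mpg, foreign, price_f)

    The correlations involving foreign should come out as missing, because its standard deviation in the observations that -corr- actually uses is 0.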



    • #3
      Another fun fact, specific to Stata: -correlate- uses only the observations in which all of the listed variables are nonmissing, while -pwcorr- uses, for each pair of variables, the observations in which both members of that pair are nonmissing.

      So you can easily check what Clyde suggests by rerunning what you are now doing with -correlate- using -pwcorr- instead.

      Here:

      Code:
      . sysuse auto, clear
      (1978 Automobile Data)
      
      . correlate mpg head rep
      (obs=69)
      
                   |      mpg headroom    rep78
      -------------+---------------------------
               mpg |   1.0000
          headroom |  -0.3996   1.0000
             rep78 |   0.4023  -0.1480   1.0000
      
      
      . pwcorr mpg head rep, obs
      
                   |      mpg headroom    rep78
      -------------+---------------------------
               mpg |   1.0000 
                   |       74
                   |
          headroom |  -0.4138   1.0000 
                   |       74       74
                   |
             rep78 |   0.4023  -0.1480   1.0000 
                   |       69       69       69
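
      To see where the 69 comes from, here is a minimal sketch that counts the observations with no missing values on any of the three variables; nmiss is a hypothetical variable name chosen for this illustration.

      Code:
      sysuse auto, clear
      * number of missing values among the three variables, per observation
      egen nmiss = rowmiss(mpg headroom rep78)
      * observations complete on all three: the sample -correlate- uses
      count if nmiss == 0
      * per-variable nonmissing counts, comparable to the -pwcorr, obs- output
      summarize mpg headroom rep78

      The count should agree with the obs=69 that -correlate- reports above.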



      • #4
        In addition to the excellent advice already given, I suggest that your output would be less puzzling if you got Stata [NB spelling] to show a scatter plot matrix. Use just the list of variables you entered into -corr- (I can't copy and paste from your attachment) but with

        Code:
        graph matrix
        as the command. A pair in which both variables are constant will show as a single point, and a pair in which one variable but not the other is constant will show as a single stripe. In both situations the correlation is indeterminate.
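
        For instance, a minimal sketch with the auto data, reusing the three variables from #3 since I cannot see yours (the half option only suppresses the redundant upper triangle of the matrix):

        Code:
        sysuse auto, clear
        graph matrix mpg headroom rep78, half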
