Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • partial correlation table for many variables

    I am trying to calculate a partial correlations for many variables with regard to one variable while only considering nonmissing observations in terms of both the i and the j and also output all the partial correlations to excel.

    This code is close I think, but not working. There is a problem in the loop. (I am not great at loops.) The error is "too few variables" r(102).

    One can assume the "control" variable is never missing, but sometimes observations in the "varlist" variables are missing in a random way. I know the "webuse auto" dataset doesn't have missing obs, but please pretend that it does.


    Code:
    webuse auto, clear
    
    local varlist mpg trunk weight length  turn     // main variables
    local control price                             // variable to partial out
    
    local nvars : word count `varlist'
    
    matrix pmat = J(`nvars', `nvars', .)
    
    forvalues i = 1/`nvars' {
        local vi : word `i' of `varlist'
        forvalues j = `i'/`nvars' {
            local vj : word `j' of `varlist'
    
            preserve
            quietly keep if !missing(`vi', `vj', `control')
    
            * Run partial correlation
            quietly pcorr `vi' `vj' `control'
    
            * Extract correlation and store in matrix
            matrix temp = r(Rho)
            local val = temp[1,2]
    
            matrix pmat[`i', `j'] = `val'
            matrix pmat[`j', `i'] = `val'
            restore
        }
    }
    
    matrix rownames pmat = `varlist'
    matrix colnames pmat = `varlist'
    
    putexcel set partial_corr_output.xlsx, replace
    putexcel A1 = matrix(pmat), names
    Thanks in advance.


    Last edited by Kyle Smith; 10 Apr 2025, 14:55.

  • #2
    You have two problems in the code, neither of which really pertains to looping per se.

    The first problem is that -pcorr- will not allow you to calculate the partial correlation of a variable with itself. So you need to prevent the loop from running those cases. That is what is causing the (not very clear or helpful) error message you are getting.

    The second problem is that -pcorr- does not return any r(Rho) matrix, so even after you fix the first problem and eliminate that error message, you will just get an empty result. The matrix that holds the partial correlations after -pcorr- is called r(p_corr).

    The minimal fix to these problems is:
    Code:
    webuse auto, clear
    
    local varlist mpg trunk weight length  turn     // main variables
    local control price                             // variable to partial out
    
    local nvars : word count `varlist'
    
    matrix pmat = J(`nvars', `nvars', .)
    
    forvalues i = 1/`nvars' {
        local vi : word `i' of `varlist'
        forvalues j = `i'/`nvars' {
            local vj : word `j' of `varlist'
    
            preserve
            quietly keep if !missing(`vi', `vj', `control')
    
            * Run partial correlation
          if `i' != `j' {
                quietly pcorr `vi' `vj' `control'
            
                * Extract correlation and store in matrix
                matrix temp = r(p_corr)
                local val = temp[1, 1]
    
                matrix pmat[`i', `j'] = `val'
                matrix pmat[`j', `i'] = `val'
            }
            restore
        }
    }
    
    matrix rownames pmat = `varlist'
    matrix colnames pmat = `varlist'
    
    matrix list pmat
    The above code will do what you seek.

    There are other refinements you could make to render the code more efficient:
    1. In your situation, the only command inside the loop that cares about missing values in `vi' `vj' and `control' is the -pcorr- command. But -pcorr- deals with missing values by excluding those observations anyway. Since your -pcorr-command doesn't contain any variables other than `vi', `vj', and `control', you don't actually need to do anything to make the code handle the missing values exactly the way you want.
    2. Using -preserve- and -restore- is going to slow things down because you will be constantly thrashing the disk. If you run this on a large data set with a large number of variables, the slow down will be noticeable. You could eliminate some of the disk write operations by using -restore, preserve- instead of -restore-. But you don't even need any of this at all. Instead of modifying the whole data set to eliminate observations with missing values, leave the data set alone. Nothing in that loop cares whether there are missing values except -pcorr-, which handles them the way you want anyway.
    3. The local macro val is really not needed. You can just directly set -matrix pmat[`i', `j'] = temp[1,1]- directly, and the same for pmat[`j', `i'].
    4. If you like, instead of adding the -if `i' != `j'- guard around the calculations, you could omit that and change the inner -forvalues- loop to start j at i+1 instead of at i.
    So if we add these refinements, we get:
    Code:
    webuse auto, clear
    
    local varlist mpg trunk weight length  turn     // main variables
    local control price                             // variable to partial out
    
    local nvars : word count `varlist'
    
    matrix pmat = J(`nvars', `nvars', .)
    
    forvalues i = 1/`nvars' {
        local vi : word `i' of `varlist'
        forvalues j = `=`i'+1'/`nvars' {
            local vj : word `j' of `varlist'
    
        // NOTE: NO -preserve- AND -restore- NEEDED
    
            * Run partial correlation
            quietly pcorr `vi' `vj' `control'
        
            * Extract correlation and store in matrix
            matrix temp = r(p_corr)
            matrix pmat[`i', `j'] = temp[1, 1]
            matrix pmat[`j', `i'] = temp[1, 1]
        }
    }
    
    matrix rownames pmat = `varlist'
    matrix colnames pmat = `varlist'
    
    matrix list pmat

    Comment


    • #3
      Clyde Schechter : You're the man. TYVM. You have saved me many times. Very grateful for your help and clarification.

      Comment

      Working...
      X