  • saving correlation matrices with corrci

    There are quite often requests on Statalist for ways to export correlation matrices as produced in Stata by correlate or pwcorr.

    It is not always clear quite what (1) file format, (2) recipient software, (3) layout, or (4) purpose is in mind, but many such requests seem close to wanting to display matrices in similar form in MS Word or MS Excel, or possibly LaTeX or HTML. Rightly or wrongly, I often suspect some ritual display of matrices in Appendices that supervisors or reviewers insist on but almost no-one ever reads.

    Be that as it may, many of the major tabulation or reporting commands will do a good job for you.

    The point of this post is just to publicise a different way of exporting sets of correlations as Stata datasets in ways that might allow or even encourage further analyses, even just graphics.

    corrci was written in the first instance to support confidence intervals for (Pearson) correlations using the atanh() transformation, often called Fisher's z transformation, but other bells and whistles were added at the time and later.
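
    As a minimal sketch of what that transformation does, and not corrci's own code, the limits for one pair of variables can be computed by hand: transform r with atanh(), step out by a normal quantile times 1/sqrt(n - 3), and transform back with tanh(). The scalar names rho, se and zcrit are mine. If I have the arithmetic right, this should agree with the mpg and weight limits in the example further down.

    Code:
    sysuse auto, clear
    quietly correlate mpg weight
    * correlate leaves the correlation in r(rho) and the sample size in r(N)
    scalar rho   = r(rho)
    scalar se    = 1/sqrt(r(N) - 3)
    scalar zcrit = invnormal(0.975)
    * back-transform the limits on the z scale with tanh()
    display "r = " %6.3f rho ///
        "  lower = " %6.3f tanh(atanh(rho) - zcrit*se) ///
        "  upper = " %6.3f tanh(atanh(rho) + zcrit*se)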


    As is typical for such efforts, anyone interested in corrci is asked to look at the original paper but to use the latest version of the software (as I write, 2021).


    Code:
    . search corrci, sj
    
    Search of official help files, FAQs, Examples, and Stata Journals
    
    SJ-21-3 pr0041_4  . . . . . . . . . . . . . . . . . Software update for corrci
            (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
            Q3/21   SJ 21(3):847
            improves explanation of the format() option and fixes a bug
            concerning saving results to a new dataset
    
    SJ-20-4 pr0041_3  . . . . . . . . . . . . . . . . . Software update for corrci
            (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
            Q4/20   SJ 20(4):1028--1030
            corrects code for a bias correction used if (and only if) the
            fisher option is specified
    
    SJ-17-3 pr0041_2  . . . . . . . . . . . . . . . . . Software update for corrci
            (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
            Q3/17   SJ 17(3):779
            new options added
    
    SJ-10-4 pr0041_1  . . . . . . . . . . . . . . . . . Software update for corrci
            (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
            Q4/10   SJ 10(4):691
            update to fix corrci so that it always saves r-class results
    
    SJ-8-3  pr0041  .  Speaking Stata: Corr. with confidence, Fisher's z revisited
            (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
            Q3/08   SJ 8(3):413--439
            reviews Fisher's z transformation and its inverse, the
            hyperbolic tangent, and reviews their use in inference
            with correlations

    Here is a token example, which extends to a quite common request to export several correlation matrices, typically one for each group of observations.

    So, the idea is just to loop over each group and save each new set of results to a new dataset. Then loop over the datasets and put them together with append.

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    . forval x = 0/1 {
      2. corrci mpg weight displacement, saving(corr`x')
      3. }
    
    (obs=74)
    
                               correlations and 95% limits
    mpg          weight           -0.807   -0.874   -0.710
    mpg          displacement     -0.706   -0.804   -0.569
    weight       displacement      0.895    0.838    0.933
    
    (obs=74)
    
                               correlations and 95% limits
    mpg          weight           -0.807   -0.874   -0.710
    mpg          displacement     -0.706   -0.804   -0.569
    weight       displacement      0.895    0.838    0.933
    
    . use corr0 , clear
    
    . gen foreign = 0
    
    . l
    
         +---------------------------------------------------------------------+
         |   var1           var2           r       lower       upper   foreign |
         |---------------------------------------------------------------------|
      1. |    mpg         weight   -.8071749   -.8744006   -.7095432         0 |
      2. |    mpg   displacement   -.7056426   -.8044354   -.5688671         0 |
      3. | weight   displacement    .8948958    .8376901    .9326781         0 |
         +---------------------------------------------------------------------+
    
    . forval x = 1/1 {
      2. append using corr`x'
      3. replace foreign = `x' if missing(foreign)
      4. }
    (3 real changes made)
    
    . l
    
         +---------------------------------------------------------------------+
         |   var1           var2           r       lower       upper   foreign |
         |---------------------------------------------------------------------|
      1. |    mpg         weight   -.8071749   -.8744006   -.7095432         0 |
      2. |    mpg   displacement   -.7056426   -.8044354   -.5688671         0 |
      3. | weight   displacement    .8948958    .8376901    .9326781         0 |
      4. |    mpg         weight   -.8071749   -.8744006   -.7095432         1 |
      5. |    mpg   displacement   -.7056426   -.8044354   -.5688671         1 |
         |---------------------------------------------------------------------|
      6. | weight   displacement    .8948958    .8376901    .9326781         1 |
         +---------------------------------------------------------------------+
    The result is trivial here, but the point is that more correlations and/or more groups wouldn't imply code that was much more complicated.
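
    To make that concrete, here is a hedged sketch of the same loop written so that it does not depend on knowing the group values in advance. It uses levelsof to collect the values of foreign and includes the if qualifier that #2 points out is needed. It assumes the grouping variable is numeric and that corr0.dta and corr1.dta do not already exist in the working directory; I have not shown output.

    Code:
    sysuse auto, clear
    * collect the distinct values of the grouping variable
    levelsof foreign, local(levels)
    * one set of correlations, and one saved dataset, per group
    foreach x of local levels {
        corrci mpg weight displacement if foreign == `x', saving(corr`x')
    }
    * start from an empty dataset and stack the saved results
    clear
    gen byte foreign = .
    foreach x of local levels {
        append using corr`x'
        * newly appended observations are the only ones with foreign missing
        replace foreign = `x' if missing(foreign)
    }
    list, sepby(foreign)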

    I'd particularly commend graph dot as lending itself to easy and effective display (including comparison) of correlations.
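
    A rough sketch of such a display, assuming the combined dataset from the example is in memory and that var1 and var2 are string variables, as the listing suggests. The variable name pair and the value label name origin are mine.

    Code:
    * one label per pair of variables
    gen pair = var1 + " & " + var2
    label define origin 0 "Domestic" 1 "Foreign"
    label values foreign origin
    * (asis) plots each stored correlation as it stands, without summarizing
    graph dot (asis) r, over(foreign) over(pair) ytitle("correlation")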

    To address one point before it is raised, corrci does not support pairwise correlations in the sense of pwcorr, which I tend to distrust. It wouldn't be very difficult to clone corrci and edit the code to support that, but such functionality is not on my to-do list.
    Last edited by Nick Cox; 17 Jan 2024, 06:28.

  • #2
    The code in #1 is stupid because it should be

    Code:
     
     corrci mpg weight displacement if foreign == `x', saving(corr`x')
    -- otherwise you just get the same correlations -- but the principle is still good!
