  • Calculate a covariance/correlation between two variables within a group in Stata

    Hello,

    I want to calculate a covariance/correlation between two variables within a group in Stata.

    My data are as follows:
    prior_date c_pairs prox
    1980 BE-NL 0.576022
    1980 BE-IT 0.711139
    1980 BE-GB 0.255543
    1980 BE-FR 0.257338
    1980 BE-US 0.07187
    1980 AT-BE 0.799441
    1980 BE-OTHERS 0.69717
    1980 BE-DK 0.585538
    1980 BE-DE 0.076019
    1980 BE-SE 0.686572
    For each c_pairs, I want to calculate the covariance/correlation between prior_date and prox.

    Could someone help me to do that?

    Thanks

  • #2
    Here's an example using the postfile command.

    Code:
    clear
    sysuse auto
    * set up a temporary file to collect one row of results per group
    tempname memhold
    tempfile results
    postfile `memhold' foreign cor cov using "`results'"
    levelsof foreign, local(levels)
    foreach foreign of local levels {
        * correlation within this group
        correlate weight length if foreign == `foreign'
        local cor = r(rho)
        * covariance within this group
        correlate weight length if foreign == `foreign', covariance
        local cov = r(cov_12)
        post `memhold' (`foreign') (`cor') (`cov')
    }
    postclose `memhold'
    * load the collected results
    use "`results'", clear
    list
    Code:
    . list
    
         +-------------------------------+
         | foreign        cor        cov |
         |-------------------------------|
      1. |       0   .9210244   12838.44 |
      2. |       1   .9105639   5394.719 |
         +-------------------------------+
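
    As a possible shorter alternative to the postfile loop, here is a minimal sketch using the official statsby prefix. This is a different technique from the one posted above, shown on the same auto data. It collects only the correlation, r(rho). For the covariance, the same pattern should work with the covariance option and r(cov_12), as in the loop.

    ```stata
    * sketch: one row per group holding the within-group correlation
    sysuse auto, clear
    statsby cor=r(rho), by(foreign) clear: correlate weight length
    list
    ```

    Note that statsby with the clear option replaces the data in memory with the results, so save your data first if you still need them.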



    • #3
      Your example isn't promising, as prior_date is constant and there is only one observation for each pair, so no correlation within pairs can be calculated from it. But guessing at what you want, consider this example using rangestat (SSC).

      Code:
      webuse grunfeld , clear
      rangestat (corr) invest mvalue, int(year 0 0)
      tabdisp year, c(corr_x) format(%4.3f)
      
      --------------------------------
           year |               corr_x
      ----------+---------------------
           1935 |                0.930
           1936 |                0.834
           1937 |                0.808
           1938 |                0.797
           1939 |                0.887
           1940 |                0.906
           1941 |                0.919
           1942 |                0.924
           1943 |                0.914
           1944 |                0.934
           1945 |                0.951
           1946 |                0.946
           1947 |                0.944
           1948 |                0.887
           1949 |                0.928
           1950 |                0.926
           1951 |                0.929
           1952 |                0.918
           1953 |                0.941
           1954 |                0.925
      --------------------------------
      That leads me to guess that you want something like

      Code:
      ssc install rangestat
      gen foo = 42
      rangestat (corr) prior_date prox, int(foo 0 0) by(c_pairs)
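      Here foo is just a constant to keep rangestat happy: with int(foo 0 0), every observation in the same c_pairs group falls inside the interval. You don't give a decent data example using dataex (do please read and act on FAQ Advice #12), so I can't tell whether c_pairs is string or numeric with value labels.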

      You can get covariances too: just read the help file.
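
      For what it's worth, here is a minimal sketch of the covariance version, run on the grunfeld data so that it is reproducible. It assumes that rangestat is installed from SSC, that it accepts (cov) in the same way as (corr), and that it names the result cov_x by analogy with corr_x above. Do check the help file to confirm. Using company as the group is my choice for illustration only.

      ```stata
      * sketch: within-group covariance of invest and mvalue (needs: ssc install rangestat)
      webuse grunfeld, clear
      * constant key variable, so the interval covers every observation in the group
      gen byte foo = 42
      rangestat (cov) invest mvalue, interval(foo 0 0) by(company)
      * assumed result name cov_x, by analogy with corr_x
      tabdisp company, c(cov_x)
      ```

      The same call with prior_date, prox and by(c_pairs) would be the analogue for the data in #1, provided that each pair has more than one observation.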



      • #4
        I had to go read the rangestat help file after this example. In addition to correlations, one can supply a user-defined Mata function. Pretty neat!
