  • Accessing Tuples to Name Matrix Rows

    Hello everyone,

    I am not a stellar Stata 18 user, so forgive me if my question is dumb.

    I have a dataset with several variables x1 x2 x3 x4 x5 ... x43 and I want to perform a Cox regression (stcox) on all possible combinations of 3 variables. To do so, I successfully used the tuples command as:

    tuples x1 x2 x3 ... x43, min(3) max(3)

    which has created the 165 possible combinations for me. Then, I want to create a 165 x 1 matrix to store a value of "1" whenever all p-values associated with the 3 variables under analysis are < 0.05. I have done this as:

    matrix multivariata = J(165,1,.)

    local j = 1

    forval i = 1/165 {
        qui stcox `tuple`i''
        matrix mytab2 = r(table)
        scalar pvalue1 = mytab2[4,1]
        scalar pvalue2 = mytab2[4,2]
        scalar pvalue3 = mytab2[4,3]
        if pvalue1 < 0.05 & pvalue2 < 0.05 & pvalue3 < 0.05 {
            matrix multivariata[`j', 1] = 1
        }
        local j = `j' + 1
    }

    I now have my matrix of interest, multivariata, but I am unable to access the tuples in any meaningful way to label the i-th row of my matrix with the variables in the i-th tuple. Is there any way to do this?

    Any help would be appreciated, thank you very much!

  • #2
    Originally posted by Tommaso Roccuzzo
    To do so, I successfully used the tuples command
    tuples is probably from SSC, as you are asked to explain.

    Originally posted by Tommaso Roccuzzo
    I am unable to access the tuples in any meaningful way to label the i-th row of my matrix with the variables in the i-th tuple. Is there any way to do this?
    If the combined names of the variables in any 3-tuple are longer than 30 characters (plus the 2 spaces = 32), then you cannot use them as a row name for your matrix. Provided your variable names do not hit that limit, you can do something like

    Code:
    . tuples a b c d e , min(3) max(3)

    . matrix foo = J(`ntuples',1,.)

    . forvalues i = 1/`ntuples' {
    .     local my_rownames = `"`my_rownames' "`tuple`i''""'
    . }

    . matrix rownames foo = `my_rownames'

    . matrix list foo
    
    foo[10,1]
           c1
    c d e   .
    b d e   .
    b c e   .
    b c d   .
    a d e   .
    a c e   .
    a c d   .
    a b e   .
    a b d   .
    a b c   .
    
    .
    end of do-file

    Originally posted by Tommaso Roccuzzo
    Then, I want to create a 165 x 1 matrix to store a value of "1" whenever all p-values associated with the 3 variables under analysis are < 0.05. I have done this as:
    Not that I am super fanatic about adjusting p-values for multiple comparisons, but you do realize that a p-value of 0.05 essentially says that you are expecting five false positives for any hundred tests? And, you are running 165 tests with a set of three variables each ...



    • #3
      1. Something doesn't compute here. If you have 43 variables and you are looking at subsets of 3 distinct variables, then the number of such combinations is 12,341, not 165. And, in fact, the -tuples- command shown does produce 12,341 triples. This makes Daniel Klein's comment about multiple tests even more trenchant.

      2. My advice with regard to O.P.'s primary question would be not to accumulate these results in a matrix to begin with. Instead, -post- the results for each regression, along with the names of the variables involved, to a -tempfile- or a -frame-, thereby saving the information in a new Stata data set. After that, if there is some compelling reason to put the information in a matrix, you can always use -mkmat- to convert that data set to a Stata matrix, or you can make a Mata matrix from the new data set. But honestly, unless you are going to do some actual matrix algebra with these results (and I don't see how they lend themselves to it), I don't see any advantage to having this in matrix form.
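
      A minimal sketch of the -post- approach in point 2, not a definitive implementation. It assumes the data are already -stset- and that -tuples- has just been run, so that `ntuples' and `tuple1', `tuple2', ... exist. The names handle, results, tuple_id, vars, allsig and T are illustrative and do not come from the thread. Unlike the original matrix, which holds 1 or missing, this stores 0/1.

      Code:
      * assumption: data are stset and -tuples- has been run with min(3) max(3)
      tempname handle
      tempfile results
      postfile `handle' int tuple_id str100 vars byte allsig using `results'

      forvalues i = 1/`ntuples' {
          quietly stcox `tuple`i''
          matrix T = r(table)
          * row 4 of r(table) holds the p-values, as in the original post;
          * a missing p-value is never < 0.05, so it counts as not significant
          local allsig = (T[4,1] < 0.05) & (T[4,2] < 0.05) & (T[4,3] < 0.05)
          post `handle' (`i') ("`tuple`i''") (`allsig')
      }
      postclose `handle'

      * this replaces the data in memory, so save or -preserve- first if needed
      use `results', clear
      list in 1/5

      Each observation then carries the variable names of its tuple in vars, so no row labels (and no row-name length limit) are involved.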



      • #4
        Dear Daniel,
        Thank you for your reply! I understand your concern regarding the p-values and multiple tests. However, the study is not mine and I am simply performing the analysis I was asked to do...
        I am going to look into your code right away and hope it works within the character limit.



        • #5
          Dear Clyde,

          Thank you for your reply! You are absolutely right, many apologies for the mistake on my side. The model does indeed have 43 variables, but I only have to form the combinations from 11 of them, so the number of resulting combinations of 3 is indeed 165. The reason for the matrix is simply that I am used to storing outputs in matrices and exporting them to Excel using putexcel. Nevertheless, I will surely look into your suggestions and see if I can learn something useful and new!
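
          For reference, both counts mentioned in this thread can be checked with Stata's comb() function, which returns the number of combinations of n items taken k at a time:

          Code:
          * triples from all 43 variables
          display comb(43, 3)
          * triples from the 11 variables actually used
          display comb(11, 3)

          The first line displays 12341 and the second displays 165.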



          • #6
            If you collect the data in a data set, as I suggested in #3, you can export that data to Excel using -export excel-.
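
            A minimal sketch, assuming the results data set described in #3 is the data currently in memory. The file name is hypothetical.

            Code:
            * write the data in memory to a workbook, with variable names in the first row
            export excel using "multivariata.xlsx", firstrow(variables) replace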
