Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Outreg2 and a foreach loop for univariate cox regression: issues with formatting of output in excel

    Hi guys,
    This is my first time posting on Statalist. I am a doctor and I and use STATA to conduct medical research. I have recently been learning coding in STATA to help me perform statistics for different studies more efficiently, but I am not an expert by any means, and I apologise in advance if my terminology is incorrect throughout this post. Survival analysis and cox regression is key to medical research. I perform univariate cox regression over 50+ variables, before then using stepwise backwards multivariate cox regression to build a model.

    When performing univariate cox regression, I used to run regress for each variable manually, and copy and paste each regression output into excel (which you can imagine is very tedious). I have since been trying to automate this process, and have since discovered foreach loops which has made this possible.I have been trying to get my results to automatically copy into an excel spreadsheet, to save me even more time. I have been using the outreg2 function to do this. However, this leads to data being displayed diagonally, rather than horizontally across a column as I would expect. I was wondering if anyone was able to help with formatting the output so that the data runs horizontally rather than diagonally, with variable names as the header for each column? I would prefer to have variable names listed down the first column with data running across a row, but I am unsure if this is possible with the outreg2 function.

    I would really appreciate any help! If I haven’t formatted something correctly, please let me know. My code (scaled down for only 3 variables for simplicity) is enclosed below. I have attached images of my excel output which include my current excel output, the output I am trying to achieve and the output I would ideally want (but I am unsure if it is possible), respectively, from left to right.

    Thanks in advance for your time and consideration.

    Kind regards,
    Duji

    Code:
    foreach i in sex hb platelet inr {
    stcox `i' if final_dx == 1, robust
    outreg2 using booknew.xls, excel append stats(coef se pval ci) eform stnum(replace coef=exp(coef), replace se=coef*se)
        }
    Click image for larger version

Name:	my current output in excel.png
Views:	1
Size:	99.9 KB
ID:	1740558 Click image for larger version

Name:	output in excel i am trying to achieve.png
Views:	1
Size:	81.9 KB
ID:	1740559 Click image for larger version

Name:	my ideal excel output, unsure if this is possible.png
Views:	1
Size:	73.5 KB
ID:	1740560




  • #2
    Duji,

    I am not an expert in outreg2 but the most customizable way to get this job done is based on the -collect- system which is available in more recent Stata versions (if you have a older version < Stata ?16, this may not work). I would strongly suggest you invest some time to learn that. To make your life easier for now, I have attached a sample code I crafted several months ago for our standard uni- and multivariable Cox analysis (honestly, my own code looks like a work in progress to me at this time even though I probably spent 30-35 collective hours getting to this point back at the time). Sorry I cannot share sample data; but all predictors are binary and you should be able to replicate this with the public stan3 dataset perhaps and adapt the code appropriately. You should be able to relate these if you are physician too.

    Best,

    Girish


    Code:
    *Set working directory
        cap cd "C:\Users\Vijayalakshmi\Dropbox\Girish Files"
        cap cd "C:\Users\NewPC\Dropbox\Girish Files"
        cap cd "D:\gvenkataraman\Documents\Dropbox\Girish Files"
    
    ****************************************************************************************************
    *Title: Spectrum of TP53-mutated myeloid leukemia: A single institution cohort across 6 years
    ****************************************************************************************************
    // Program: eap53-analysis.do
    // Task   : Uses p53-merge-v2.dta 
    // Outputs: Analysis for the EAP53 communication
    // Project: EAp53 score and outcome 
    // Author : GV \date: 2023-1-04-2023
    
    *Stata versions
        version 17
        clear all
        set linesize 80
        macro drop _all
        set more off
    
        use ".\A_UOC academic\Project Amandeep TP53\p53merged-v2.dta"
    * Check multiple ways that the total is now 125
        qui count
    /*
        assert `r(N)' ==196
    */
        isid mrn
    
    //     Chunk to use only the 100 cases
    preserve
    
    * Save only cases with any missense since we only need those patients
    drop if anymis==0
    count
    assert `r(N)' == 196
    /*
    save ".\Rotation_Harini\p53merged-v2-100.dta", replace
    label data "100 cases with only missense for EAP53"
    compress
    */
    
    * OS24 analysis:
    stset docens, fail(dead)                 origin(time dodx)                     ///
                                            enter(time dodx)                     ///
                                            exit(time dodx + 24 * 30.455)         ///
                                            scale(30.455)
    
    //======================================================================
    
    // Table of mva model
    
    lab var age70 "Age at Dx."
    lab var smotl1 "TP53 mutn. domain"
    
    collect clear
    collect create mv1
    collect _r_b _r_ci _r_p: stcox i.age70 i.nmis i.mono2 i.ep6
    collect _r_b _r_ci: stcox i.age70
    collect _r_b _r_ci: stcox i.nmis
    collect _r_b _r_ci: stcox i.mono2
    collect _r_b _r_ci: stcox i.ep6
    collect _r_b _r_ci: stcox i.del5
    collect _r_b _r_ci: stcox i.loss17
    collect _r_b _r_ci: stcox i.kar50
    collect _r_b _r_ci: stcox i.top10
    collect _r_b _r_ci: stcox i.smotl1
    collect _r_b _r_ci: stcox i.revel95
    collect _r_b _r_ci: stcox i.bayes55
    
    collect style showbase all
    
    // Number formatting
    *-------------------
    collect style cell, nformat(%5.2f)
    collect style cell result[_r_ci], sformat("[%s]") cidelimiter(", ")
    collect style cell border_block, border(right, pattern(nil))
    collect preview
    
    //Var and label name formatting
    *------------------------------
    collect stars _r_p 0.01 "***" 0.05 "** " 0.1 "* " 1 " ", attach(_r_b)
    collect notes : "*** p<.01, ** p<.05, * p<.1"
    collect style cell result[_r_b], halign(left) margin(left, width(10))
    collect style cell result[N], nformat(%4.0f)
    collect style cell colname#cell_type[row-header], font(,bold)
    
    
    collect levels cmdset
    collect addtags uvmodel[2], fortags(cmdset[2 3 4 5 6 7 8 9 10 11 12])
    collect addtags uvmodel[1], fortags(cmdset[1])
    collect levelsof uvmodel
    
    
    //Align CI to the left and relabel some
    *--------------------------------------
    collect style cell result[_r_ci], halign(center)
    collect style cell result[_r_b], halign(left)
    
    collect label levels uvmodel 1 "Multi-variable" 2 "Uni-variable"
    collect label levels age70 0 "<=70 yrs" 1 ">70 yrs", modify
    collect label levels monomore 0 "0-1" 1 "2+", modify
    collect label dim age70 "Age at Dx.", modify
    
    collect label levels result _r_b "HR", modify
    collect label levels result _r_ci "95% CI", modify
    
    collect style column, dups(center) extraspace(2) width(asis)
    collect style row split, dups(first) spacer 
    collect style row stack, nobinder    // indents the factor from variable name
    collect style cell cell_type[column-header]#uvmodel[1] cell_type[column-header]#uvmodel[2], border(bottom, pattern(single))
    collect layout (colname)(uvmodel#result)
    collect preview
    
    putdocx begin
    collect style putdocx, layout(autofitcontents) ///
    title("Table 2: Univ and Multi-variable Cox PH models")
    putdocx collect
    putdocx save ".\Rotation_Harini\Figures and tables\Fig_table_cox.docx", replace
    
    restore
    Example word output (looks strange here, but prettier in the word document version offline)
    Table 2: Univ and Multi-variable Cox PH models
    Uni-variable Multi-variable
    HR 95% CI p-value HR 95% CI p-value
    Age at Dx.
    <=70 yrs 1.00 1.00
    >70 yrs 1.64*** [1.17, 2.28] 0.00 1.49** [1.05, 2.11] 0.03
    TP53 mutn. class
    Missense 1.00 1.00
    Non-Missense 1.03 [0.67, 1.59] 0.89 0.99 [0.63, 1.55] 0.97
    Monosomies
    0-1 Monosomy 1.00 1.00
    2+ Monosomies 1.40* [0.99, 1.98] 0.06 1.54** [1.09, 2.19] 0.02
    EPI6 signature
    Absent 1.00 1.00
    Present 1.72*** [1.24, 2.40] 0.00 1.75*** [1.23, 2.48] 0.00
    del(5q)
    Absent 1.00
    Present 1.39* [0.97, 2.00] 0.07
    Loss of chm.17-New
    0 1.00
    1 1.34* [0.95, 1.91] 0.10
    Clonal cells in Karyotype
    <=50% clonal cells 1.00
    >50% clonal cells 1.53* [0.93, 2.54] 0.10
    Top 10 hotspot mutations
    Non-hotspot 1.00
    Hotspot 0.81 [0.56, 1.18] 0.27
    TP53 mutn. domain
    Non-L1/S/H2 Mutation 1.00
    L1/S/H2 Mutation 1.52** [1.09, 2.13] 0.01
    REVEL score
    Score <=0.95 1.00
    Score > 0.95 1.15 [0.75, 1.77] 0.51
    Bayes deletion score
    Score <=0.55 1.00
    Score > 0.55 1.19 [0.85, 1.66] 0.31
    *** p<.01, ** p<.05, * p<.1


    Comment

    Working...
    X