Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating a single descriptive statistics table for a range of variables using a foreach loop

    I have a healthcare dataset that codes a principal diagnosis using a variable 'I10_DX1' and secondary diagnoses as 'I10_DX2' to 'I10_DX40' such that a patient has the possibility of having 39 separate secondary diagnoses coded upon an encounter. These are string variables using ICD-9-CM or ICD-10-CM classification systems.

    I am attempting to code comorbidities based on these 39 potential secondary diagnoses to capture all patients who may have this comorbidity captured during their hospital encounters. With this, I am attempting to use "foreach loops" to minimize the amount of code necessary to generate these variables. I have a working code that indicates I have generated a variable, but I am experiencing challenges generating a table or using the 'tab' command to view the number of patients with these comorbidities.

    As an example, I am looking to generate "previous myocardial infarction" as a comorbidity, based on the ICD-10-CM classification system, using the code below.

    Code:
    foreach x of varlist I10_DX2-I10_DX40 { 2. gen `x'_test_prevMI = 0 3. replace `x'_test_prevMI = 1 if `x'== "I252" 4. }
    Stata indicates that the variable is generated without an error message. However, when I attempt to generate descriptive statistics using the tab function within the foreach loop, I receive 39 table outputs for I10_DX2_test_prevMI to I10_DX40_test_prevMI instead of just one table for `x'_test_prevMI.

    Code:
    foreach x of varlist I10_DX2-I10_DX40 { 2. tab `x'_test_prevMI 3. } Does anyone have any insight as to how I could revise my code such that I receive one descriptive statistics table for the variable "`x'_test_prevMI"? Any advice and/or guidance is welcome and greatly appreciated!

  • #2
    Your two loops could be combined

    Code:
    foreach x of varlist I10_DX2-I10_DX40 {
        gen `x'_test_prevMI = `x'== "I252"
        tab `x;_test_prevMI 
    }
    but 39 separate tables are quite a lot.

    Your code does not separate diagnosis of something other than I252 from empty strings, as any possibility but I252 is represented by 0, but perhaps you don't care.

    tabm from tab_chi on SSC may help, but I don't really have a clear picture of your overall goal

    Comment


    • #3
      Hi Nick,

      Thank you for taking the time to respond and providing this advice! I will try tabm.

      You're right, I am able to combine the loops - I am fairly new to using them, so my approach has been to test out the loop codes separately, before combining, to ensure the outputs are what I am wanting.

      I appreciate that my original question was a bit vague. To clarify, instead of having 39 separate tables or variables, I would like to have them combined into a single variable that I could move forward with exploring in univariate and bivariate analyses (such as a table and regression models using my outcome variable). I am able to achieve this using a code similar to below, but it is not as efficient as using a loop (especially when I need to input 100+ ICD codes for particular comorbidities).


      Code:
      gen test_prevMI2 = I10_DX2 == "I252" | I10_DX3 == "I252" | I10_DX4 == "I252" | I10_DX5 == "I252" | I10_DX6 == "I252" | I10_DX7 == "I252" | I10_DX8 == "I252" | I10_DX9 == "I252" | I10_DX10 == "I252" | I10_DX11 == "I252" | I10_DX12 == "I252" | I10_DX13 == "I252" | I10_DX14 == "I252" | I10_DX15 == "I252" | I10_DX16 == "I252" | I10_DX17 == "I252" | I10_DX18 == "I252" | I10_DX19 == "I252" | I10_DX20 == "I252" | I10_DX21 == "I252" | I10_DX22 == "I252" | I10_DX23 == "I252" | I10_DX24 == "I252" | I10_DX25 == "I252" | I10_DX26 == "I252" | I10_DX27 == "I252" | I10_DX28 == "I252" | I10_DX29 == "I252" | I10_DX30 == "I252" | I10_DX31 == "I252" | I10_DX32 == "I252" | I10_DX33 == "I252" | I10_DX34 == "I252" | I10_DX35 == "I252" | I10_DX36 == "I252" | I10_DX37 == "I252" | I10_DX38 == "I252" | I10_DX39 == "I252" | I10_DX40 == "I252"


      tab test_prevMI2




      I hope this helps clarify my question and what I am hoping to achieve with the foreach loops.

      Again, I appreciate any advice and guidance on this matter!

      Comment


      • #4
        Code:
        gen wanted=0
        foreach var of varlist I10_DX*{
            replace wanted= 1 if `var'=="1252" & !wanted
        }

        If all entries of the variables I10_DX* are numbers, destring and use the egen function -anymatch()-

        Code:
        foreach var of varlist I10_DX*{
            destring `var', replace
        }
        egen wanted = anymatch(I10_DX*), values(1252)

        Comment


        • #5
          Hi Andrew,

          Thank you for your response! I had the opportunity to run this code and it has worked - thank you so much for your help!

          Comment

          Working...
          X