
  • Counting diseases

    Hi dear Statalist:

    I have a question. My dataset has 6764 subjects, of whom 4142 have at least one disease. I have around 124 variables, 42 of which are dummies for individual diseases, for example cancer (0 = no, 1 = yes) and arthritis (0 = no, 1 = yes). The diseases are not mutually exclusive. I would like a count of how many persons have each number of diseases. Example:

    1 disease: 500 persons
    2 diseases: 475 persons
    3 diseases: 234 persons

    Is there a way to count, for each person, across the 42 disease variables that are among the 142 variables? And to get the frequency of the number of diseases from the dummy variables?

    Best regards,

  • #2
    If your data structure is like this
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id disease1 disease2 disease3)
    1 0 0 0
    2 1 0 0
    3 1 1 0
    end
    then you just need totals across each observation


    Code:
    egen ndiseases = rowtotal(disease*)
    
    list
    
         +------------------------------------------------+
         | id   disease1   disease2   disease3   ndisea~s |
         |------------------------------------------------|
      1. |  1          0          0          0          0 |
      2. |  2          1          0          0          1 |
      3. |  3          1          1          0          2 |
         +------------------------------------------------+
    Clearly you'll need to use your real variable names. If your disease variables are adjacent in the dataset, something like


    Code:
    egen wanted = rowtotal(cancer-colitis)
    may be enough.
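
    For the frequency table asked for in #1 (how many persons have 1, 2, 3, ... diseases), tabulating the new count variable should be enough. This is only a minimal sketch, assuming the count variable is named ndiseases as in the example above and that every disease dummy is coded 0/1. Note that rowtotal() treats missing values as zero unless the missing option is specified, so persons with missing dummies would be counted as not having those diseases.

    Code:
    * frequency of the number of diseases per person
    tabulate ndiseases

    * the same table, leaving out persons with no disease at all
    tabulate ndiseases if ndiseases > 0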



    • #3
      Dear Nick:
      I used the command
      egen ndiseases = rowtotal(disease*)
      It worked perfectly!! Thank you so much!!
