  • Combination of dummy variables and return mean of the dummies

    Dear all,

    I posted a similar topic before, but the information I provided was not sufficient to get the answer I wanted (which is my fault).

    Code:
    clear
    input symptomA symptomB symptomC symptomD symptomE symptomF symptomG QOL
    0 0 0 0 0 1 0 62
    0 0 0 0 0 0 1 93
    0 0 0 0 0 0 1 87
    0 0 1 0 0 0 1 82
    0 0 0 0 0 0 1 88
    0 0 0 0 0 0 0 99
    0 0 0 1 1 0 0 64
    0 0 0 0 0 0 1 78
    0 0 0 0 1 1 0 69
    0 0 0 0 0 0 1 77
    0 0 0 0 0 1 1 93
    0 0 0 0 0 0 1 81
    0 0 0 0 0 0 1 61
    0 0 0 0 0 1 1 100
    0 0 0 0 0 0 0 79
    0 0 0 0 0 0 1 41
    0 0 0 0 0 1 1 80
    1 0 0 0 0 1 1 74
    0 0 0 0 0 0 1 52
    0 0 0 0 0 0 1 89
    0 0 0 0 0 0 0 81
    0 0 0 0 0 0 0 90
    0 0 0 0 0 0 0 95
    0 1 0 0 0 1 1 28
    0 0 0 0 0 0 0 85
    0 0 0 0 0 0 0 96
    0 0 0 0 0 0 0 48
    0 0 0 0 0 0 1 61
    0 0 0 0 0 0 1 96
    0 0 0 0 0 0 1 90
    0 0 0 0 0 1 1 62
    0 0 0 0 0 1 0 72
    0 0 0 0 0 0 0 95
    0 0 0 0 0 0 1 59
    0 0 0 0 1 1 0 79
    0 0 0 0 0 1 0 93
    0 0 0 0 0 0 1 92
    0 0 0 0 0 0 0 54
    0 0 0 0 0 0 1 96
    0 0 0 0 0 0 0 81
    end
    I want the mean of QOL (quality of life) for people with two symptoms. For example, if someone has symptomA and symptomB (that is, symptomA == 1 and symptomB == 1), I want the mean of QOL over all observations that have both symptomA and symptomB. Likewise, I need this for all 7C2 possible pairs of symptoms, showing which symptoms each pair consists of (because I need to check which kinds of symptoms go with lower QOL).

    So the result would look like

    symptomA & symptomB mean QOL: xx
    ...

    symptomF & symptomG mean QOL: xx

    Afterwards, I want the mean QOL for each of the 7C3 combinations of three symptoms as well.

    Any help will be appreciated. Thank you!

    BW
    Kim





  • #2
    Sungwook:
    do you mean something along the following lines?
    Code:
    . egen wanted=group( symptomA- symptomG)
    
    . bysort wanted: egen mean_QoL=mean(QOL)
    
    . list
    
         +------------------------------------------------------------------------------------------------------+
         | symptomA   symptomB   symptomC   symptomD   symptomE   symptomF   symptomG   QOL   wanted   mean_QoL |
         |------------------------------------------------------------------------------------------------------|
      1. |        0          0          0          0          0          0          0    54        1   82.09091 |
      2. |        0          0          0          0          0          0          0    81        1   82.09091 |
      3. |        0          0          0          0          0          0          0    96        1   82.09091 |
      4. |        0          0          0          0          0          0          0    99        1   82.09091 |
      5. |        0          0          0          0          0          0          0    81        1   82.09091 |
         |------------------------------------------------------------------------------------------------------|
      6. |        0          0          0          0          0          0          0    85        1   82.09091 |
      7. |        0          0          0          0          0          0          0    90        1   82.09091 |
      8. |        0          0          0          0          0          0          0    95        1   82.09091 |
      9. |        0          0          0          0          0          0          0    95        1   82.09091 |
     10. |        0          0          0          0          0          0          0    79        1   82.09091 |
         |------------------------------------------------------------------------------------------------------|
     11. |        0          0          0          0          0          0          0    48        1   82.09091 |
     12. |        0          0          0          0          0          0          1    61        2    77.5625 |
     13. |        0          0          0          0          0          0          1    89        2    77.5625 |
     14. |        0          0          0          0          0          0          1    96        2    77.5625 |
     15. |        0          0          0          0          0          0          1    41        2    77.5625 |
         |------------------------------------------------------------------------------------------------------|
     16. |        0          0          0          0          0          0          1    87        2    77.5625 |
     17. |        0          0          0          0          0          0          1    77        2    77.5625 |
     18. |        0          0          0          0          0          0          1    88        2    77.5625 |
     19. |        0          0          0          0          0          0          1    93        2    77.5625 |
     20. |        0          0          0          0          0          0          1    61        2    77.5625 |
         |------------------------------------------------------------------------------------------------------|
     21. |        0          0          0          0          0          0          1    81        2    77.5625 |
     22. |        0          0          0          0          0          0          1    78        2    77.5625 |
     23. |        0          0          0          0          0          0          1    59        2    77.5625 |
     24. |        0          0          0          0          0          0          1    96        2    77.5625 |
     25. |        0          0          0          0          0          0          1    52        2    77.5625 |
         |------------------------------------------------------------------------------------------------------|
     26. |        0          0          0          0          0          0          1    90        2    77.5625 |
     27. |        0          0          0          0          0          0          1    92        2    77.5625 |
     28. |        0          0          0          0          0          1          0    62        3   75.66666 |
     29. |        0          0          0          0          0          1          0    93        3   75.66666 |
     30. |        0          0          0          0          0          1          0    72        3   75.66666 |
         |------------------------------------------------------------------------------------------------------|
     31. |        0          0          0          0          0          1          1    93        4      83.75 |
     32. |        0          0          0          0          0          1          1    80        4      83.75 |
     33. |        0          0          0          0          0          1          1    62        4      83.75 |
     34. |        0          0          0          0          0          1          1   100        4      83.75 |
     35. |        0          0          0          0          1          1          0    79        5         74 |
         |------------------------------------------------------------------------------------------------------|
     36. |        0          0          0          0          1          1          0    69        5         74 |
     37. |        0          0          0          1          1          0          0    64        6         64 |
     38. |        0          0          1          0          0          0          1    82        7         82 |
     39. |        0          1          0          0          0          1          1    28        8         28 |
     40. |        1          0          0          0          0          1          1    74        9         74 |
         +------------------------------------------------------------------------------------------------------+
    
    .
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Hello Carlo,

      Thanks for your reply. It is similar, but I want to display which combination goes with which mean QoL. For the mean QoL of 83.75 in lines 31 to 34 of your listing, I want to display "symptom F & G". Likewise, for lines 35 to 36, I want to display "symptom E & F".

      Anyway, thank you very much for your help!
      BW
      Kim



      • #4
        Please consider this token code and result:

        Code:
        clear
        input symptomA symptomB symptomC symptomD symptomE symptomF symptomG QOL
        0 0 0 0 0 1 0 62
        0 0 0 0 0 0 1 93
        0 0 0 0 0 0 1 87
        0 0 1 0 0 0 1 82
        0 0 0 0 0 0 1 88
        0 0 0 0 0 0 0 99
        0 0 0 1 1 0 0 64
        0 0 0 0 0 0 1 78
        0 0 0 0 1 1 0 69
        0 0 0 0 0 0 1 77
        0 0 0 0 0 1 1 93
        0 0 0 0 0 0 1 81
        0 0 0 0 0 0 1 61
        0 0 0 0 0 1 1 100
        0 0 0 0 0 0 0 79
        0 0 0 0 0 0 1 41
        0 0 0 0 0 1 1 80
        1 0 0 0 0 1 1 74
        0 0 0 0 0 0 1 52
        0 0 0 0 0 0 1 89
        0 0 0 0 0 0 0 81
        0 0 0 0 0 0 0 90
        0 0 0 0 0 0 0 95
        0 1 0 0 0 1 1 28
        0 0 0 0 0 0 0 85
        0 0 0 0 0 0 0 96
        0 0 0 0 0 0 0 48
        0 0 0 0 0 0 1 61
        0 0 0 0 0 0 1 96
        0 0 0 0 0 0 1 90
        0 0 0 0 0 1 1 62
        0 0 0 0 0 1 0 72
        0 0 0 0 0 0 0 95
        0 0 0 0 0 0 1 59
        0 0 0 0 1 1 0 79
        0 0 0 0 0 1 0 93
        0 0 0 0 0 0 1 92
        0 0 0 0 0 0 0 54
        0 0 0 0 0 0 1 96
        0 0 0 0 0 0 0 81
        end
        
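        * build a string naming the symptoms each observation has
        gen which = ""
        
        foreach s in A B C D E F G {
            * append the letter whenever that symptom is present
            replace which = which + "`s'" if symptom`s' == 1
        }
        
        * observations with no symptoms at all
        replace which = "none" if which == ""
        
        * mean, SD and frequency of QOL for each observed combination
        tab which, su(QOL)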
        
        
                    |           Summary of QOL
              which |        Mean   Std. dev.       Freq.
        ------------+------------------------------------
                AFG |          74           0           1
                BFG |          28           0           1
                 CG |          82           0           1
                 DE |          64           0           1
                 EF |          74   7.0710678           2
                  F |   75.666667   15.821926           3
                 FG |       83.75   16.700798           4
                  G |     77.5625   17.331931          16
               none |   82.090909   16.872786          11
        ------------+------------------------------------
              Total |       77.55   17.414921          40
        The number of symptoms is the row total of your indicators.
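
        As a minimal sketch of that remark (the variable name nsymptoms is only illustrative, and the code assumes the indicators and the variable which created above are in memory), the row total can be used to restrict the same tabulation to observations with exactly two or exactly three symptoms:

        ```stata
        * count the symptoms each observation has (nsymptoms is a hypothetical name)
        egen nsymptoms = rowtotal(symptomA-symptomG)

        * mean QOL for each observed combination of exactly two symptoms
        tab which if nsymptoms == 2, su(QOL)

        * the same for combinations of exactly three symptoms
        tab which if nsymptoms == 3, su(QOL)
        ```

        Only combinations that actually occur in the data are listed; pairs or triples with no observations do not appear in the table.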

        See also upsetplot and vennbar as discussed in

        SJ-24-2 gr0095 . . . The joy of sets: Graphical alt. to Euler & Venn diagrams
        . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and T. P. Morris
        (help upsetplot, vennbar, sortmean if installed)
        Q2/24 SJ 24(2):329--361
        introduces graphical alternatives for Euler or Venn diagrams
        mapped to bar or dot charts



        Last edited by Nick Cox; 10 Oct 2024, 07:54.



        • #5
          Code:
          clear
          input symptomA symptomB symptomC symptomD symptomE symptomF symptomG QOL
          0 0 0 0 0 1 0 62
          0 0 0 0 0 0 1 93
          0 0 0 0 0 0 1 87
          0 0 1 0 0 0 1 82
          0 0 0 0 0 0 1 88
          0 0 0 0 0 0 0 99
          0 0 0 1 1 0 0 64
          0 0 0 0 0 0 1 78
          0 0 0 0 1 1 0 69
          0 0 0 0 0 0 1 77
          0 0 0 0 0 1 1 93
          0 0 0 0 0 0 1 81
          0 0 0 0 0 0 1 61
          0 0 0 0 0 1 1 100
          0 0 0 0 0 0 0 79
          0 0 0 0 0 0 1 41
          0 0 0 0 0 1 1 80
          1 0 0 0 0 1 1 74
          0 0 0 0 0 0 1 52
          0 0 0 0 0 0 1 89
          0 0 0 0 0 0 0 81
          0 0 0 0 0 0 0 90
          0 0 0 0 0 0 0 95
          0 1 0 0 0 1 1 28
          0 0 0 0 0 0 0 85
          0 0 0 0 0 0 0 96
          0 0 0 0 0 0 0 48
          0 0 0 0 0 0 1 61
          0 0 0 0 0 0 1 96
          0 0 0 0 0 0 1 90
          0 0 0 0 0 1 1 62
          0 0 0 0 0 1 0 72
          0 0 0 0 0 0 0 95
          0 0 0 0 0 0 1 59
          0 0 0 0 1 1 0 79
          0 0 0 0 0 1 0 93
          0 0 0 0 0 0 1 92
          0 0 0 0 0 0 0 54
          0 0 0 0 0 0 1 96
          0 0 0 0 0 0 0 81
          end
          
          // who has two symptoms?
          egen nsympt = rowtotal(symptom?)
          gen byte touse = (nsympt == 2)
          
          // combinations (with labels)
          local sympt "A B C D E F G"
          foreach s of local sympt {
              label define `s' 1 "`s'"
              label value symptom`s' `s'
          }
          
          egen gr = group(symptom?) if touse, label
          
          // display the means
          table (gr) , stat(freq) stat(mean QOL) nototal
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------
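
          The code above keeps only observations with exactly two symptoms. If the aim is instead the mean QOL over every observation that has a given pair of symptoms, whatever else it has, one possible sketch is to loop over all 7C2 pairs directly. This is an assumption about what is wanted rather than part of the answer above, and it assumes the example data are in memory:

          ```stata
          * loop over all unordered pairs of symptoms
          local sympt "A B C D E F G"
          local n : word count `sympt'

          forvalues i = 1/`n' {
              local a : word `i' of `sympt'
              forvalues j = `=`i'+1'/`n' {
                  local b : word `j' of `sympt'
                  * mean QOL among observations with both symptoms present
                  quietly summarize QOL if symptom`a' == 1 & symptom`b' == 1, meanonly
                  * skip pairs that never occur together
                  if r(N) > 0 {
                      display "symptom`a' & symptom`b' mean QOL: " %6.2f r(mean) " (n = " r(N) ")"
                  }
              }
          }
          ```

          Here an observation with symptoms A, F and G contributes to the pairs A & F, A & G and F & G, so these means can differ from those for observations with exactly two symptoms. A third nested loop would extend the same idea to the 7C3 triples.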



          • #6
            Dear Nick
            Thanks for your reply. It works perfectly! Regarding the combinations: this is a subset of my original data. The full data include observations with 3 or 4 symptoms and have over 1,000 observations.

            Thank you again!

            BW
            Kim



            • #7
              Dear Maarten,

              Thanks for your reply. Your solution is great and works perfectly!

              BW
              Kim
