Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate sum of certain observations in a group

    Hello,

    I want to calculate the sum of only certain categories' value in a group. Here is a part of the sample (I randomly chose the numbers).
    Family Person Income
    1 1 100
    1 2 110
    1 3 130
    2 1 50
    2 2 60
    2 3 70
    2 4 80
    3 1 40
    3 2 50
    In Family 1, there are three family members; in Family 2, four family members; and in Family 3, two members.

    I want to calculate the sum of Person 1 and 2's income. So, for Family 1, it should be 100+110 = 220 and for Family 2, 50+60=110.


    I want to create this below data set:
    Family Person Income Income_1_2
    1 1 100 210
    1 2 110 210
    1 3 130 210
    2 1 50 110
    2 2 60 110
    2 3 70 110
    2 4 80 110
    3 1 40 90
    3 2 50 90
    I tried to use this code, but I don't know how to make conditions.
    bys Family: egen Income_1_2 = total(Income)

    Thank you in advance.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(family person) int income
    1 1 100
    1 2 110
    1 3 130
    2 1  50
    2 2  60
    2 3  70
    2 4  80
    3 1  40
    3 2  50
    end
    
    bys family(person): egen wanted=total(cond(inlist(person, 1,2), income, .))
    Res.:

    Code:
    . l, sepby(family)
    
         +-----------------------------------+
         | family   person   income   wanted |
         |-----------------------------------|
      1. |      1        1      100      210 |
      2. |      1        2      110      210 |
      3. |      1        3      130      210 |
         |-----------------------------------|
      4. |      2        1       50      110 |
      5. |      2        2       60      110 |
      6. |      2        3       70      110 |
      7. |      2        4       80      110 |
         |-----------------------------------|
      8. |      3        1       40       90 |
      9. |      3        2       50       90 |
         +-----------------------------------+
    
    .

    Comment


    • #3
      Thank you Andrew. That is exactly what I want.

      Comment


      • #4
        Another way would be

        Code:
        . bysort family (person): egen totinc = total(income/(_n<3))
        
        . list, sepby(family)
        
             +-----------------------------------+
             | family   person   income   totinc |
             |-----------------------------------|
          1. |      1        1      100      210 |
          2. |      1        2      110      210 |
          3. |      1        3      130      210 |
             |-----------------------------------|
          4. |      2        1       50      110 |
          5. |      2        2       60      110 |
          6. |      2        3       70      110 |
          7. |      2        4       80      110 |
             |-----------------------------------|
          8. |      3        1       40       90 |
          9. |      3        2       50       90 |
             +-----------------------------------+

        Comment


        • #5
          #4 The help for egen warns against using _n in calls to that command. A safer tweak would be one of

          Code:
          bysort family : egen totinc = total(income / (person <= 2)) 
          
          bysort family : egen totinc = total(income / inlist(person, 1, 2))
          For more on the approaches in #2 and #4 see Sections 9 and 10 respectively of https://www.stata-journal.com/articl...article=dm0055

          Comment

          Working...
          X