
  • Divide siblings into younger and older pairs for each child

    I'd like to count each child's siblings by relative age and gender. Can anyone help me?
    For each child, I need the number of older brothers, older sisters, younger brothers, and younger sisters.
    Households are identified by hhid14 and children by pid14; order is the birth order within the household.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double pid14 str8 hhid14 double(sex birth age) float order
     4 "0010600" 0 1985 29  1
     8 "0010600" 0 1998 16  2
     9 "0010600" 0 2002 12  3
    10 "0010600" 1 2011  3  4
    11 "0010600" 1 2013  1  5
     3 "0010651" 0 2008  6  1
     4 "0010651" 0 2012  3  2
     3 "0010851" 0 2002 12  1
     4 "0010851" 1 2004 10  2
     5 "0010851" 1 2006  8  3
     6 "0010851" 0 2008  6  4
     7 "0010851" 1 2011  3  5
     3 "0012200" 1 1980 34  1
     4 "0012200" 0 1982 32  2
     5 "0012200" 0 1984 30  3
     6 "0012200" 1 1985 28  4
     7 "0012200" 0 1987 27  5
     8 "0012200" 1 1990 24  6
     9 "0012200" 1 1992 22  7
    10 "0012200" 0 1993 21  8
    13 "0012200" 0 1999 15  9
    14 "0012200" 1 2002 12 10
    15 "0012200" 0 2004 10 11
     3 "0012241" 0 1999 15  1
     4 "0012241" 0 2001 13  2
     5 "0012241" 1 2003 11  3
     6 "0012241" 1 2013  0  4
     3 "0012400" 1 1974 40  1
     4 "0012400" 1 1979 35  2
     5 "0012400" 0 1981 33  3
     6 "0012400" 0 1982 32  4
     7 "0012400" 0 1983 30  5
     8 "0012400" 0 1985 29  6
     9 "0012400" 0 1987 27  7
    10 "0012400" 0 1994 20  8
    14 "0012400" 0 1996 17  9
     3 "0012451" 1 1993 21  1
     4 "0012451" 0 1996 18  2
     5 "0012451" 0 2001 13  3
     6 "0012451" 0 2003 11  4
     7 "0012451" 0 2007  7  5
     9 "0012500" 1 2003 11  1
    10 "0012500" 1 2006  8  2
    11 "0012500" 0 2009  5  3
    12 "0012500" 1 2011  3  4
    13 "0012500" 0 2014  0  5
     3 "0012900" 0 1984 30  1
     4 "0012900" 1 1986 28  2
     5 "0012900" 1 1990 24  3
     6 "0012900" 0 1991 22  4
     7 "0012900" 0 1993 20  5
     8 "0012900" 1 1998 16  6
    10 "0012900" 0 2001 13  7
    11 "0012900" 1 2003 11  8
     3 "0012951" 1 2004 10  1
     4 "0012951" 1 2006  9  2
     5 "0012951" 0 2012  3  3
     3 "0012952" 1 2009  6  1
     4 "0012952" 0 2011  4  2
     5 "0012952" 0 2014  1  3
     6 "0012952" 1 2015  0  4
     5 "0012953" 1 2014  1  1
     7 "0020100" 0 1993 21  1
     8 "0020200" 0 2000 14  1
     9 "0020200" 0 2003 11  2
    10 "0020200" 1 2010  4  3
     3 "0020400" 0 1987 27  1
     4 "0020400" 0 1989 25  2
     5 "0020400" 0 1993 21  3
     6 "0020400" 0 1995 19  4
     7 "0020400" 1 1998 16  5
     8 "0020400" 1 2002 12  6
     9 "0020400" 1 2005  9  7
    11 "0020400" 0 2008  6  8
     4 "0020441" 0 2005 10  1
     5 "0020441" 1 2006  9  2
     6 "0020441" 1 2012  2  3
     5 "0020442" 1 1998 16  1
     4 "0020442" 1 2002 12  2
     7 "0020500" 1 2005 10  1
     8 "0020500" 0 2009  5  2
     9 "0020500" 0 2013  2  3
     7 "0020600" 1 1998 15  1
     8 "0020600" 0 2000 14  2
     9 "0020600" 0 2005  9  3
    10 "0020600" 1 2009  5  4
     3 "0020651" 1 2001 13  1
     4 "0020651" 1 2003 11  2
     6 "0020651" 0 2004 11  3
     5 "0020651" 1 2005  9  4
     7 "0020651" 1 2009  5  5
     8 "0020651" 0 2011  4  6
     4 "0021142" 0 1988 26  1
     3 "0021142" 1 1992 22  2
     5 "0021142" 0 2003 11  3
     3 "0021143" 1 1996 18  1
     4 "0021143" 1 1999 15  2
     5 "0021143" 1 2003 11  3
     6 "0021143" 0 2009  5  4
     3 "0021300" 0 1989 26  1
    end
    label values sex gender
    label def gender 0 "female", modify
    label def gender 1 "male", modify

  • #2
    Thanks for the nice question. I used rangestat from SSC (install it with ssc install rangestat). The number of older (younger) children is just the number of siblings with lower (higher) values of order. The number of males is just the sum of sex (which would be better called male (*)) over those siblings, and you can get the number of females by subtraction.


    Code:
    clear
    input double pid14 str8 hhid14 double(sex birth age) float order
     4 "0010600" 0 1985 29  1
     8 "0010600" 0 1998 16  2
     9 "0010600" 0 2002 12  3
    10 "0010600" 1 2011  3  4
    11 "0010600" 1 2013  1  5
     3 "0010651" 0 2008  6  1
     4 "0010651" 0 2012  3  2
     3 "0010851" 0 2002 12  1
     4 "0010851" 1 2004 10  2
     5 "0010851" 1 2006  8  3
     6 "0010851" 0 2008  6  4
     7 "0010851" 1 2011  3  5
     3 "0012200" 1 1980 34  1
     4 "0012200" 0 1982 32  2
     5 "0012200" 0 1984 30  3
     6 "0012200" 1 1985 28  4
     7 "0012200" 0 1987 27  5
     8 "0012200" 1 1990 24  6
     9 "0012200" 1 1992 22  7
    10 "0012200" 0 1993 21  8
    13 "0012200" 0 1999 15  9
    14 "0012200" 1 2002 12 10
    15 "0012200" 0 2004 10 11
     3 "0012241" 0 1999 15  1
     4 "0012241" 0 2001 13  2
     5 "0012241" 1 2003 11  3
     6 "0012241" 1 2013  0  4
     3 "0012400" 1 1974 40  1
     4 "0012400" 1 1979 35  2
     5 "0012400" 0 1981 33  3
     6 "0012400" 0 1982 32  4
     7 "0012400" 0 1983 30  5
     8 "0012400" 0 1985 29  6
     9 "0012400" 0 1987 27  7
    10 "0012400" 0 1994 20  8
    14 "0012400" 0 1996 17  9
     3 "0012451" 1 1993 21  1
     4 "0012451" 0 1996 18  2
     5 "0012451" 0 2001 13  3
     6 "0012451" 0 2003 11  4
     7 "0012451" 0 2007  7  5
     9 "0012500" 1 2003 11  1
    10 "0012500" 1 2006  8  2
    11 "0012500" 0 2009  5  3
    12 "0012500" 1 2011  3  4
    13 "0012500" 0 2014  0  5
     3 "0012900" 0 1984 30  1
     4 "0012900" 1 1986 28  2
     5 "0012900" 1 1990 24  3
     6 "0012900" 0 1991 22  4
     7 "0012900" 0 1993 20  5
     8 "0012900" 1 1998 16  6
    10 "0012900" 0 2001 13  7
    11 "0012900" 1 2003 11  8
     3 "0012951" 1 2004 10  1
     4 "0012951" 1 2006  9  2
     5 "0012951" 0 2012  3  3
     3 "0012952" 1 2009  6  1
     4 "0012952" 0 2011  4  2
     5 "0012952" 0 2014  1  3
     6 "0012952" 1 2015  0  4
     5 "0012953" 1 2014  1  1
     7 "0020100" 0 1993 21  1
     8 "0020200" 0 2000 14  1
     9 "0020200" 0 2003 11  2
    10 "0020200" 1 2010  4  3
     3 "0020400" 0 1987 27  1
     4 "0020400" 0 1989 25  2
     5 "0020400" 0 1993 21  3
     6 "0020400" 0 1995 19  4
     7 "0020400" 1 1998 16  5
     8 "0020400" 1 2002 12  6
     9 "0020400" 1 2005  9  7
    11 "0020400" 0 2008  6  8
     4 "0020441" 0 2005 10  1
     5 "0020441" 1 2006  9  2
     6 "0020441" 1 2012  2  3
     5 "0020442" 1 1998 16  1
     4 "0020442" 1 2002 12  2
     7 "0020500" 1 2005 10  1
     8 "0020500" 0 2009  5  2
     9 "0020500" 0 2013  2  3
     7 "0020600" 1 1998 15  1
     8 "0020600" 0 2000 14  2
     9 "0020600" 0 2005  9  3
    10 "0020600" 1 2009  5  4
     3 "0020651" 1 2001 13  1
     4 "0020651" 1 2003 11  2
     6 "0020651" 0 2004 11  3
     5 "0020651" 1 2005  9  4
     7 "0020651" 1 2009  5  5
     8 "0020651" 0 2011  4  6
     4 "0021142" 0 1988 26  1
     3 "0021142" 1 1992 22  2
     5 "0021142" 0 2003 11  3
     3 "0021143" 1 1996 18  1
     4 "0021143" 1 1999 15  2
     5 "0021143" 1 2003 11  3
     6 "0021143" 0 2009  5  4
     3 "0021300" 0 1989 26  1
    end
    label values sex gender
    label def gender 0 "female", modify
    label def gender 1 "male", modify
    
    rangestat (count) y_count=sex (sum) y_male=sex, int(order 1 .) by(hhid14)
    rangestat (count) o_count=sex (sum) o_male=sex, int(order . -1) by(hhid14)
    
    quietly foreach v of var ?_count ?_male { 
        replace `v' = 0 if missing(`v')
    }
    
    list if hhid14 == hhid14[1]
    
         +--------------------------------------------------------------------------------------+
         | pid14    hhid14      sex   birth   age   order   y_count   y_male   o_count   o_male |
         |--------------------------------------------------------------------------------------|
      1. |     4   0010600   female    1985    29       1         4        2         0        0 |
      2. |     8   0010600   female    1998    16       2         3        2         1        0 |
      3. |     9   0010600   female    2002    12       3         2        2         2        0 |
      4. |    10   0010600     male    2011     3       4         1        1         3        0 |
      5. |    11   0010600     male    2013     1       5         0        0         4        1 |
         +--------------------------------------------------------------------------------------+

    (*) See Section 10 of https://www.stata-journal.com/articl...article=dm0099 for an (entirely unoriginal) explanation of why indicator variables are better named after the condition coded 1. This is an old story: compare foreign in the auto dataset.
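
For readers who cannot install rangestat, the same four counts can be sketched with built-in commands only. This is not the method used in the post above, just a cross-check under stated assumptions: order is unique within each hhid14, sex is coded 0/1 with no missing values, and the helper variable hh_male is a name made up for this illustration. The data are the first two households from the example.

```stata
clear
input double pid14 str8 hhid14 double sex float order
 4 "0010600" 0 1
 8 "0010600" 0 2
 9 "0010600" 0 3
10 "0010600" 1 4
11 "0010600" 1 5
 3 "0010651" 0 1
 4 "0010651" 0 2
end

* With children sorted by birth order inside each household,
* position alone gives the number of older and younger siblings
bysort hhid14 (order): gen o_count = _n - 1
by hhid14: gen y_count = _N - _n

* Running sum of sex, minus the child's own value, counts older males
by hhid14: gen o_male = sum(sex) - sex

* hh_male (hypothetical helper): total males in the household;
* removing older males and the child leaves the younger males
egen hh_male = total(sex), by(hhid14)
gen y_male = hh_male - o_male - sex

list, sepby(hhid14)
```

If order has ties or sex has missing values, the position-based counts and the running sum no longer mean what the variable names say, so check both before relying on this.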






    • #3
      Thanks a lot, Nick! But the results show only the numbers of older and younger brothers and the total numbers of older and younger siblings. How do I calculate the numbers of older sisters and younger sisters?



      • #4
        As I said, you can use subtraction.

        Code:
        gen y_female = y_count - y_male
        For example, in household 0010600 the child with pid14 4 has 4 younger siblings, 2 of them male, so 2 are female.

        A similar trick will get you o_female.
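
To spell out both subtractions together, here is a minimal sketch that starts from the counts listed in #2 for household 0010600 (typed in by hand so the snippet runs on its own) and derives the two female counts. The final assert is an extra consistency check, not part of the original answer.

```stata
clear
input double pid14 byte(y_count y_male o_count o_male)
 4 4 2 0 0
 8 3 2 1 0
 9 2 2 2 0
10 1 1 3 0
11 0 0 4 1
end

* Sisters are whatever remains after removing brothers from each total
gen y_female = y_count - y_male
gen o_female = o_count - o_male

* Consistency check: no count should ever be negative
assert y_female >= 0 & o_female >= 0

list
```

On the full dataset, the two gen lines can be run directly after the rangestat commands and the foreach loop that replaces missing values with 0.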

