
  • How to identify oldest children with highest education?

    Dear all,

    I have household data and I want to generate a variable giving the year of birth of the child with the highest educational level (condition 1); if two or more children share that highest level, I want to choose the oldest one (condition 2).

    For example, look at qid=183 (qid is the id of the parents): this individual has three children. Here, I want to create a binary variable identifying the child born in 1953 (e.g., =1, and 0 or missing otherwise), because she is older even though her educational level is the same as that of the child born in 1962.

    Note: I think this question deserves a new thread so I created this one. My previous post can be found at: https://www.statalist.org/forums/for...r-parents-data

    Code:
    clear
    input long qid byte(csex relationship) int cyob byte cedu
     11 2 4 1970 7
     13 1 3 1975 7
     16 1 3 1971 5
    110 1 3 1959 7
    111 1 3 1977 7
    112 1 3 1977 6
    123 1 3 1988 7
    125 1 3 1982 7
    129 2 4 1985 6
    134 1 3 1975 5
    136 1 3 1963 4
    136 2 4 1967 4
    137 1 3 1976 6
    137 2 3 1980 7
    137 1 3 1983 6
    138 2 4 1969 5
    139 1 3 1955 5
    140 1 3 1978 4
    141 1 3 1970 5
    142 2 4 1959 4
    146 1 3 1976 5
    146 1 3 1978 5
    147 1 3 1957 4
    148 1 3 1982 5
    151 1 3 1975 3
    152 1 3 1986 3
    152 1 3 1992 3
    153 1 3 1955 4
    154 1 3 1977 4
    156 1 3 1957 4
    159 1 3 1962 4
    161 1 4 1968 5
    161 1 3 1963 6
    163 1 3 1956 4
    164 1 3 1973 7
    164 1 3 1979 7
    167 2 4 1978 9
    168 1 3 1973 7
    168 2 4 1975 7
    169 1 3 1959 4
    171 1 3 1962 5
    173 1 3 1970 5
    175 1 3 1980 8
    176 1 3 1979 8
    178 1 3 1974 6
    180 1 3 1990 5
    181 1 3 1985 7
    182 1 3 1957 4
    182 2 4 1962 7
    183 2 4 1953 4
    183 2 4 1949 1
    183 1 3 1962 4
    186 1 3 1955 4
    188 1 3 1964 4
    190 1 3 1971 5
    191 1 3 1984 7
    192 1 3 1971 7
    192 1 3 1977 5
    193 1 3 1964 7
    196 2 4 1956 4
    197 1 3 1981 7
    331 2 4 1993 4
    332 1 3 1968 3
    333 1 3 1973 3
    336 1 3 1975 7
    337 1 3 1965 3
    338 2 4 1967 4
    362 1 3 1963 5
    363 2 4 1977 4
    366 1 3 1960 5
    369 1 3 1949 6
    384 1 3 1975 7
    387 1 3 1977 4
    389 1 3 1975 7
    463 1 3 1979 8
    464 1 3 1973 7
    465 1 3 1966 7
    469 1 3 1981 7
    491 2 4 1983 6
    491 2 4 1991 5
    493 2 4 1958 7
    494 1 3 1982 7
    496 1 3 1973 8
    497 2 4 1983 6
    end
    label values csex LABEL_B25
    label def LABEL_B25 1 "Male", modify
    label def LABEL_B25 2 "Female", modify
    label values relationship relationship
    label def relationship 3 "Son", modify
    label def relationship 4 "Daughter", modify
    label values cedu cedu
    label def cedu 1 "No schooling", modify
    label def cedu 3 "Primary", modify
    label def cedu 4 "Lower secondary", modify
    label def cedu 5 "Upper secondary", modify
    label def cedu 6 "Prof secondary education", modify
    label def cedu 7 "Junior college/University", modify
    label def cedu 8 "Master", modify
    label def cedu 9 "PhD", modify
    Thank you.

  • #2
    Your request is a little confusing. In the first paragraph you say you want a variable giving the year of birth of the oldest child with the most education. But in the second paragraph you say you want an indicator for that child. Maybe you want both?

    Assuming you have no missing values in the cedu variable, this will do it:

    Code:
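    gen long obs_no = _n // MARK THE SORT ORDER
    
    assert !missing(cedu) // MISSING VALUES SORT LAST AND WOULD BE PICKED AS "HIGHEST"
    gen byte is_child = inlist(relationship, 3, 4) // 3 = SON, 4 = DAUGHTER
    // WITHIN qid: CHILDREN LAST, BY ASCENDING EDUCATION, OLDEST (SMALLEST cyob) LAST AMONG TIES
    gsort qid is_child cedu -cyob
    // THE LAST OBSERVATION IN EACH qid IS NOW THE OLDEST CHILD WITH THE MOST EDUCATION
    by qid: gen wanted_indicator = (_n == _N) if is_child[_N]
    by qid: gen wanted_yob = cyob[_N] if is_child[_N]
    
    sort obs_no // RESTORE ORIGINAL SORT ORDER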
    Note: if there are two (or more) children born the same year who both have the highest education in their household, one is picked arbitrarily and irreproducibly for the indicator. If that is not acceptable, you need to state what the rule for breaking such ties is.
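
    If a reproducible choice is ever needed, one possible rule is to keep the tied child who appears first in the original data order. The following is only a sketch of that idea, not the code above: it swaps the gsort approach for egen, the variable names max_edu, min_yob and tied_top are made up for illustration, and household 900 is a hypothetical pair of same-year siblings added to show the tie; the three rows for household 183 come from the example data.

    ```stata
    clear
    input long qid byte relationship int cyob byte cedu
    183 4 1953 4
    183 4 1949 1
    183 3 1962 4
    900 3 1970 7
    900 4 1970 7
    end
    // qid 900 IS A HYPOTHETICAL HOUSEHOLD, NOT PART OF THE ORIGINAL DATA

    gen long obs_no = _n // MARK THE ORIGINAL ORDER
    assert !missing(cedu)
    gen byte is_child = inlist(relationship, 3, 4)

    // HIGHEST EDUCATION AMONG THE CHILDREN OF EACH HOUSEHOLD
    egen max_edu = max(cond(is_child, cedu, .)), by(qid)

    // EARLIEST BIRTH YEAR AMONG THE CHILDREN WITH THAT EDUCATION
    egen min_yob = min(cond(is_child & cedu == max_edu, cyob, .)), by(qid)

    // FLAG EVERY CHILD MEETING BOTH CONDITIONS (SAME-YEAR TIES ALL GET 1 HERE)
    gen byte tied_top = is_child & cedu == max_edu & cyob == min_yob

    // BREAK REMAINING TIES REPRODUCIBLY: KEEP THE FIRST IN THE ORIGINAL ORDER
    bysort qid tied_top (obs_no): gen byte wanted_indicator = tied_top & (_n == 1)

    sort obs_no // RESTORE ORIGINAL SORT ORDER
    list qid cyob cedu tied_top wanted_indicator, sepby(qid)
    ```

    In this sketch min_yob plays the role of wanted_yob, and tied_top shows which households actually contain a same-year tie, so you can check whether the tie-breaking rule matters in your data at all.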



    • #3
      Dear Professor Clyde,

      Thank you for your swift help; I appreciate it. I am sorry for the confusion in #1. Your code works. What I want is to identify the child with the most education in each household; however, as you can see, two or more children in a household may have the same educational level. In that case, I want to pick the oldest child. I hope this clears up the confusion.

      "Note: if there are two (or more) children born the same year who both have the highest education in their household, one is picked arbitrarily and irreproducibly for the indicator."
      That is fine.

      Thank you.
      Last edited by Matthew Williams; 31 May 2021, 21:05.



      • #4
        Thanks a lot for the explanation.



        • #5
          I am also interested in solving the problem. Thank you for your inquiry.
