
  • Calculating the age difference

    Good day,



    I have a panel data set and need help calculating the age difference between spouses within a household. I started by generating two new variables for married males (mmale) and married females (mfemale). I then generated a third variable that subtracts one from the other, using egen agediff = mmale - mfemale. However, the agediff variable is empty, and I am not sure why.

    Is it possible to calculate this difference within each household, using the household ID (hhid) to ensure that I am comparing the ages of people in the same household? A sample of my data is below.

    [CODE]
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long hhid float(mmale mfemale)
    103034 . .
    220050 . .
    304902 . .
    405085 . .
    512245 . .
    103037 . .
    210778 . .
    302440 . .
    412925 . .
    513525 . .
    103037 . .
    210778 . .
    302440 . .
    412925 . .
    513525 . .
    103037 . .
    210778 . .
    302440 . .
    412925 . .
    513525 . .
    103050 . 49
    220052 . .
    305851 . .
    414998 . .
    506667 . .
    103046 . .
    220051 . .
    301578 . .
    411987 . .
    501754 . .
    103046 . .
    220051 . .
    301578 34 .
    411001 . .
    509386 . .
    103049 . .
    220494 . .
    308713 . .
    407658 . .
    512342 . .
    103049 . .
    220494 . .
    308713 . .
    407658 . .
    512342 . .
    103049 84 .
    220494 . .
    308713 . .
    407658 . .
    512342 . .
    103034 . .
    220050 . .
    304902 . .
    405085 . .
    512245 . .
    107090 . .
    212791 . .
    305588 . .
    415006 . .
    503184 . .
    103034 . .
    220050 . .
    304902 . .
    405085 . .
    512245 . .
    103045 48 .
    210781 48 .
    308048 . .
    411782 . .
    500609 . .
    103052 71 .
    210782 71 .
    304048 71 .
    405398 . .
    . . .
    103059 . .
    210784 . .
    307030 . .
    400945 . .
    510137 . .
    103046 . .
    220051 . .
    301578 . .
    411987 . .
    501754 . .
    103045 . 40
    210781 . .
    308048 . .
    411782 . .
    500609 . .
    103058 . .
    210783 . .
    308285 . .
    411829 . .
    503476 . .
    103058 . .
    210783 . .
    308285 . .
    411829 . .
    503476 . .
    end
    [/CODE]

    Thanking you in advance!

  • #2
    In your example mmale is always missing when mfemale is not, and vice versa, so the difference is always missing.
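    A quick way to confirm this, sketched on four rows copied from the example above: count the observations in which both ages are present, then take the row-wise difference. The difference is written with generate here, on the assumption that a plain subtraction was intended; egen expects a function such as min() or max().

    ```stata
    * Four rows taken from the posted example
    clear
    input long hhid float(mmale mfemale)
    103050 . 49
    301578 34 .
    103045 48 .
    103045 . 40
    end

    * Observations with both ages non-missing in the same row
    count if !missing(mmale, mfemale)

    * Row-wise difference: missing whenever either age is missing
    generate agediff = mmale - mfemale
    count if !missing(agediff)
    ```

    Both counts are 0 for these rows, because any arithmetic involving a missing value yields missing.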

    More subtly, I see just one household with a male age and a female age:


    Code:
      +--------------------------+
      |   hhid   mmale   mfemale |
      |--------------------------|
      | 103045      48         . |
      | 103045       .        40 |
      +--------------------------+
    My guess here is that one of those observations is for a male and the other for a female. Even so, the difference will still be calculated as missing, for the same reason: your command works on each observation in turn, and nothing in it instructs Stata to use values from any other observation in the same household.

    You may want something more like this

    Code:
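    * Smallest and largest non-missing male and female ages within each household
    bysort hhid: egen malemin = min(mmale)
    by hhid: egen malemax = max(mmale)
    by hhid: egen femalemin = min(mfemale)
    by hhid: egen femalemax = max(mfemale)
    
    * Take the difference only where min equals max for each sex
    gen wanted = malemin - femalemin if (malemin == malemax) & (femalemin == femalemax)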
    which will return a non-missing difference if (and only if) there is one (distinct) non-missing male age and one (distinct) non-missing female age in a household.
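    To see how that condition behaves, here is a small sketch on toy data. Household 103045 is taken from the example above; households 999001 and 999002 and their ages are hypothetical, added only to show the cases that stay missing.

    ```stata
    * Toy data: 103045 is from the posted example; 999001 and 999002 are hypothetical
    clear
    input long hhid float(mmale mfemale)
    103045 48  .
    103045  . 40
    999001 71  .
    999001 71  .
    999002 50  .
    999002 52  .
    999002  . 45
    end

    bysort hhid: egen malemin = min(mmale)
    by hhid: egen malemax = max(mmale)
    by hhid: egen femalemin = min(mfemale)
    by hhid: egen femalemax = max(mfemale)

    gen wanted = malemin - femalemin if (malemin == malemax) & (femalemin == femalemax)

    list hhid mmale mfemale wanted, sepby(hhid)
    ```

    Here wanted is 8 in both observations of household 103045. It is missing for 999001, which has no female age, and for 999002, which has two distinct male ages.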



    • #3
      Dear Nick,

      Thank you so much for the response. The code does exactly what I need. I really appreciate it.

      Regards,
      Phindile
