
  • Approximate date of birth, based on survey date and age

    I would like to generate a variable that approximates the day of birth from the available survey data.
    I have a continuous variable for age at the time the survey was conducted, the year and month of birth, the year of the survey, and three exact dates marking the start, the middle, and the end of the survey.
    A subsample of the dataset is shown below.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float age double(yearbirth monthbirth Survey_Year Survey_Start Survey_End Survey_Middle)
    15.660274 1995  9 2011 18724 18763 18743
    15.928767 1991  4 2007 17198 17256 17227
    16.019178 1991  4 2007 17258 17262 17260
    16.019178 1991  4 2007 17258 17262 17260
    16.279451 1995  1 2011 18721 18732 18726
     15.59726 1995  5 2011 18567 18627 18597
    15.517808 1991 11 2007 17252 17331 17291
    15.838356 1995  5 2011 18672 18699 18685
     15.59452 1991  8 2007 17198 17256 17227
    15.635616 1995  8 2011 18659 18747 18703
    15.750685 1991  8 2007 17272 17297 17284
     15.69315 1995  8 2011 18714 18735 18724
    15.673972 1991  9 2007 17257 17317 17287
     15.91233 1995  6 2011 18724 18763 18743
    15.671233 1995  7 2011 18672 18699 18685
    15.517808 1991 10 2007 17258 17262 17260
     15.50685 1991 11 2007 17257 17317 17287
    15.630137 1992  9 2007 17592 17683 17637
    15.421918 1995 12 2011 18721 18773 18747
     15.48767 1995 12 2011 18764 18778 18771
    end
    format %td Survey_Start
    format %td Survey_End
    format %td Survey_Middle
    label values yearbirth C02a
    label values monthbirth OC02b
    label def OC02b 1 "January", modify
    label def OC02b 4 "April", modify
    label def OC02b 5 "May", modify
    label def OC02b 6 "June", modify
    label def OC02b 7 "July", modify
    label def OC02b 8 "August", modify
    label def OC02b 9 "September", modify
    label def OC02b 10 "October", modify
    label def OC02b 11 "November", modify
    label def OC02b 12 "December", modify
    How would you proceed to estimate the day of birth?

    [Please note that everything else in the dataset is anonymized, so an estimate of the day of birth would not allow the identification of a person; and without the exact day of birth, identification would not be possible anyway.]

  • #2
    Code:
    gen dob = floor(Survey_Middle - age*365.25)
    format %td dob
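
    One thing worth checking in the posted example: the age values look like whole numbers of days divided by 365 rather than 365.25 (for instance, 15.660274*365 is 5716 to within rounding). A minimal sketch under that assumption, using the hypothetical variable names days_old and dob365, recovers the implied birth date and compares it with the reported month and year of birth:

    Code:
    * Assumption: age was stored as (days elapsed)/365, not /365.25
    gen long days_old = round(age*365)
    gen long dob365 = Survey_Middle - days_old
    format %td dob365
    * in the 20 example rows this lands on the first day of the reported birth month
    assert dob365 == mdy(monthbirth, 1, yearbirth)

    If the same pattern holds in the full dataset, age was probably derived from the year and month of birth with the day set to 1, in which case it carries no extra information about the actual day of birth.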

    You can use the recently introduced age_frac() function (a companion to age()) to check dob against the recorded age:

    Code:
    gen age2 = age_frac(dob, Survey_Middle)
    * exact equality is a strict test; float rounding alone can make it fail
    assert age == age2
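
    Because age is stored as a float and the rule used to compute it in the original data is not known, a looser check may be more informative than exact equality. A rough sketch, assuming dob has been created as above and that your Stata version includes age_frac(); the names age_check and age_gap and the 0.01-year threshold are illustrative choices only:

    Code:
    * hypothetical names: age_check, age_gap
    gen double age_check = age_frac(dob, Survey_Middle)
    gen double age_gap = abs(age - age_check)
    * inspect the size of the discrepancies instead of requiring an exact match
    summarize age_gap, detail
    * 0.01 years (about 3.65 days) is an arbitrary illustrative threshold
    list age age_check age_gap if age_gap > 0.01 & !missing(age_gap)

    Systematic gaps that grow with age would point to a different year length (365 versus 365.25) or a different reference date (Survey_Start or Survey_End) in the original age calculation.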



    • #3
      Many thanks Andrew, I am going to try your suggested solution!
