
  • Converting a 10-digit string variable into DOB and birthday

    I am very new to Stata and data management. I have already used the help command in Stata and searched Google, but I remain confused. Is there a standard set of Stata commands, or a downloadable program, for converting a string variable holding a participant's 10-digit CPR (civil registration number) into a usable 6-digit "DMY" date of birth and a 4-digit "DM" birthday? In the Danish CPR number, reading left to right, digits 1-2 are the day of birth, digits 3-4 are the month of birth, and digits 5-6 are the year of birth. I do not need to work with the last four digits (7-10).

  • #2
    Start with
    Code:
    help datetime##s3
    For your second question, which is not completely clear to me, see
    Code:
    help datetime##s6


    • #3
      Thank you, Rich, for trying to help me. I entered "help datetime##s3" and "help datetime ##s6", and both commands produced an r(601) error (file not found). I then entered "help datetime" but found the guidance very confusing. So I am still trying to figure out how to take a 10-digit string variable (a Danish CPR#) and convert the first 6 digits into "DMY" (2-digit day of birth, 2-digit month of birth, and 2-digit year of birth). I am also still struggling to figure out the Stata code that converts the first 4 digits of the string into a 4-digit "DM" birthday (2-digit day of birth and 2-digit month of birth).


      • #4
        Here is example code that should start you on your way. There are ways that use fewer lines of code, but I think you will learn some much-needed technique by seeing the step-by-step approach. Note that the code treats two-digit years 00-21 as 2000-2021 and 22-99 as 1922-1999, so anyone born before 1922 will have the wrong birth year: only the last two digits are given, and someone born in 1920 cannot be distinguished from someone born in 2020.
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str10 cpr
        "1110500000"
        "0203040000"
        end
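        * extract day, month, and two-digit year as numbers
        generate d = real(substr(cpr,1,2))
        generate m = real(substr(cpr,3,2))
        generate y = real(substr(cpr,5,2))
        * two-digit years 00-21 become 2000-2021; 22-99 become 1922-1999
        replace y = y + cond(y<=21,2000,1900)
        * daily date of birth
        generate day = mdy(m,d,y)
        format %td day
        * monthly date (year and month of birth)
        generate mon = ym(y,m)
        format %tm mon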
        list, clean noobs
        Code:
        . list, clean noobs
        
                   cpr    d    m      y         day       mon  
            1110500000   11   10   1950   11oct1950   1950m10  
            0203040000    2    3   2004   02mar2004    2004m3
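        To make the two-digit-year caveat concrete, here is a minimal sketch that reuses the century rule from the code above. The third CPR value is made up for illustration and is not from the original example.

        ```stata
        * Illustration only: the third CPR value is hypothetical
        clear
        input str10 cpr
        "1110500000"
        "0203040000"
        "0101200000"
        end
        generate y = real(substr(cpr,5,2))
        * two-digit years 00-21 go to the 2000s, 22-99 to the 1900s
        replace y = y + cond(y<=21,2000,1900)
        list, clean noobs
        * the third row is assigned 2020 whether the true birth year is 1920 or 2020
        ```

        If your data could include people born before 1922, the first six digits alone cannot settle the birth year, so you would need some other source of information to resolve it.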
        Stata's "date and time" variables are complicated and there is a lot to learn. You should begin by reading the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

        And since you say you are new to Stata, I will give you the advice I give all who identify as new to Stata.

        I'm sympathetic to you as a new user of Stata - there is quite a lot to absorb. And even worse if perhaps you are under pressure to produce some output quickly. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

        When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

        All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata - I'm still far from that goal - as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

        Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.

        Once you've done your reading, you'll be able to understand the more direct code.
        Code:
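        * clear memory first so that -input- can create cpr again
        clear
        input str10 cpr
        "1110500000"
        "0203040000"
        end
        * topyear 2021: a two-digit year resolves to the latest matching year not after 2021
        generate day = daily(substr(cpr,1,6),"DMY",2021)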
        format %td day
        generate mon = mofd(day)
        format %tm mon
        list, clean noobs
        Code:
        . list, clean noobs
        
                   cpr         day       mon  
            1110500000   11oct1950   1950m10  
            0203040000   02mar2004    2004m3
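        Neither example above builds the 4-digit "DM" birthday asked about in the original question. A minimal sketch of one way to do it follows, assuming "birthday" means the day and month only. The variable names bday and bday2021 are hypothetical, and 2021 is an arbitrary choice of reference year.

        ```stata
        * Sketch only: bday and bday2021 are hypothetical variable names
        clear
        input str10 cpr
        "1110500000"
        "0203040000"
        end
        * four-character "DM" birthday, kept as a string so leading zeros survive
        generate str4 bday = substr(cpr,1,4)
        * the same birthday as a daily date in a chosen year (2021 is arbitrary)
        generate bday2021 = mdy(real(substr(cpr,3,2)), real(substr(cpr,1,2)), 2021)
        format %td bday2021
        list, clean noobs
        ```

        Keeping bday as a string preserves values such as "0203"; converting it to a number would drop the leading zero. Note that mdy() returns missing for a 29 February birthday when the chosen year is not a leap year.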
        Last edited by William Lisowski; 13 Jun 2021, 11:39.
