  • Data management - start and end dates

    Hi,

    I have a dataset in Stata with a structure like this:

    id start_date end_date job_type
    1 1990 1992 1
    2 1991 1992 2
    3 1992 1993 1

    I would like to transform it into this:

    id date job_type
    1 1990 1
    1 1991 1
    1 1992 1
    1 1993 .
    2 1990 .
    2 1991 2
    2 1992 2
    2 1993 .
    3 1990 .
    3 1991 .
    3 1992 1
    3 1993 1

    I have created the y_start and y_end date variables, but after that I am stuck on writing the macro to transform the file. Could anyone help? Thanks.

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id int(start_date end_date) byte job_type
    1 1990 1992 1
    2 1991 1992 2
    3 1992 1994 1
    end
    
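    * Create one copy of each observation per year of its span
    expand end_date - start_date + 1
    * Number the copies within id to get consecutive years from start_date
    by id (start_date), sort: gen date = start_date + _n - 1
    * Check that each id's years run from start_date to end_date
    by id (date), sort: assert date[1] == start_date & date[_N] == end_date
    drop start_date end_date
    * Add the missing id-date combinations; job_type is missing in the added rows
    fillin id date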

    In your example data, end_date is always 1 greater than start_date. If that is always true in your data, there is a slightly simpler way to do this. But in writing the above code, I have assumed that life is not that easy, and I have accordingly modified your example data so that one of the ids (3) has a longer span of time in its observation. Thus this code should work rather generally.
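
    As a variation on the code above, and not part of the original answer, the final step could also be done with -tsfill, full- instead of -fillin-. This is only a sketch, assuming that id and date together identify each observation after the -expand- step and that date is an integer year.

```stata
clear
input byte id int(start_date end_date) byte job_type
1 1990 1992 1
2 1991 1992 2
3 1992 1994 1
end

* One row per id-year within each span, as in the code above
expand end_date - start_date + 1
bysort id (start_date): gen int date = start_date + _n - 1
drop start_date end_date

* Declare the panel structure, then add the missing id-year rows
* so that every id covers the full range of years in the data
xtset id date
tsfill, full

list, sepby(id) noobs
```

    With the example data this gives 15 observations (3 ids by 5 years), with job_type missing in the 7 added rows. Unlike -fillin-, -tsfill, full- does not create a _fillin marker variable, and it would also add years between the overall minimum and maximum that no id was observed in.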

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
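
    As an illustration of the advice above, a minimal sketch of a -dataex- call on the example data might look like this. The variable list is optional and is named here only to show the syntax.

```stata
clear
input byte id int(start_date end_date) byte job_type
1 1990 1992 1
2 1991 1992 2
3 1992 1994 1
end

* Print a copy-and-paste version of the named variables to the Results window
dataex id start_date end_date job_type
```

    The text that -dataex- prints, beginning with the "Example generated by -dataex-" comment line, can then be pasted into a post between code delimiters.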
