  • Calculate time variable for survival analysis with panel data

    Dear Statalisters,


    I want to study the impact of a migrant's language proficiency and best friend's origin on how long it takes to get the first job after migration. I'm using Stata 14.2 and have panel data from two waves, and I want to do a survival analysis using Cox regression.

    My variables (among others) are:
    ACT – employment status
    WK_RC – if respondents ever worked in the receiving country (RC) after migration
    IMDATE_op – date of migration, only asked in wave 1
    JBSTART_RC_op – date of job start in RC, only asked in wave 1
    CURRJBSTART_op – date of job start in RC, only asked in wave 2
    SAMEJB – if the job reported in CURRJBSTART_op is the same as in JBSTART_RC_op, only asked in wave 2
    FR1CB – background of best friend
    LRCSPK – RC language proficiency

    What I did so far:
    use datawave1
    append using datawave2
    sort ID wave


    Then I excluded persons who dropped out in wave 2:
    egen occurences=count(_n), by(ID)
    drop if occurences < 2


    Because ID was a string, I created a new numeric identifier variable:
    egen id= group(ID)
    list id



    Now the data looks like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float id byte(wave ACT WK_RC) str23(IMDATE_op JBSTART_RC_op CURRJBSTART_op) byte(SAMEJB FR1CB LRCSPK)
    2 1 2 2 "10/2009" "-99 (filtered)" "" . 1 3
    3 1 1 -99 "10/2009" "05/2010" "" . 1 2
    3 2 1 -99 "" "" "04/2011" 2 1 2
    4 1 2 2 "05/2010" "-99 (filtered)" "" . 1 3
    8 1 1 -99 "10/2009" "10/2009" "" . 1 4
    8 2 1 -99 "" "" "11/2009" 1 1 4
    9 1 1 -99 "10/2009" "10/2009" "" . 1 1
    9 2 1 -99 "" "" "11/2011" 2 1 1
    11 1 1 -99 "07/2010" "06/2010" "" . -99 3
    11 2 1 -99 "" "" "07/2010" 1 2 3
    12 1 1 -99 "03/2010" "04/2010" "" . 1 4
    12 2 1 -99 "" "" "04/2009" 1 3 3
    13 1 1 -99 "06/2010" "06/2010" "" . -99 3
    14 1 1 -99 "01/2010" "01/2010" "" . 2 2
    14 2 1 -99 "" "" "10/2010" 1 1 2
    16 1 1 -99 "04/2010" "04/2010" "" . 1 3
    16 2 1 -99 "" "" "12/2010" 1 2 3
    17 1 1 -99 "10/2009" "12/2009" "" . -99 3
    17 2 1 -99 "" "" "-52/2010" 1 1 2
    18 2 1 -99 "" "" "-99 (filtered)" -99 -99 -99
    20 1 1 -99 "12/2006" "02/2007" "" . 2 2
    20 2 1 -99 "" "" "02/2009" 1 1 2
    21 1 1 -99 "08/2010" "08/2010" "" . 2 2
    21 2 1 -99 "" "" "09/2010" 1 1 3
    22 1 1 -99 "08/2010" "08/2010" "" . 2 1
    22 2 1 -99 "" "" "08/2007" 1 2 1
    23 1 1 -99 "10/2009" "-99 (filtered)" "" . -99 3
    23 2 1 -99 "" "" "-99 (filtered)" -99 -99 -99
    24 1 1 -99 "09/2009" "02/2010" "" . 1 2
    25 1 1 -99 "10/2009" "04/2010" "" . 1 3
    25 2 1 -99 "" "" "10/2012" 1 1 3
    26 2 1 -99 "" "" "03/2012" 2 -99 1
    28 1 2 1 "04/2010" "09/2010" "" . -99 2
    28 2 1 -99 "" "" "07/2011" 2 1 2
    29 1 1 -99 "01/2010" "02/2010" "" . 1 3
    29 2 1 -99 "" "" "02/2010" 1 1 3
    30 1 2 2 "09/2010" "-99 (filtered)" "" . -99 2
    30 2 1 -99 "" "" "10/2011" 1 1 1
    32 1 2 2 "08/2010" "-99 (filtered)" "" . 1 2
    32 2 2 1 "" "" "-52/2010" 1 1 2
    end
    label values ACT Con38
    label def Con38 1 "working", modify
    label def Con38 2 "unemployed", modify
    label values WK_RC Con3
    label def Con3 -99 "filtered", modify
    label def Con3 1 "yes", modify
    label def Con3 2 "no", modify
    label values SAMEJB Con3_7
    label def Con3_7 -99 "filtered", modify
    label def Con3_7 1 "yes", modify
    label def Con3_7 2 "no", modify
    label values FR1CB Con4
    label def Con4 -99 "filtered", modify
    label def Con4 1 "[in CO]", modify
    label def Con4 2 "[RC]", modify
    label def Con4 3 "other", modify
    label values LRCSPK Con19
    label def Con19 -99 "filtered", modify
    label def Con19 1 "very well", modify
    label def Con19 2 "well", modify
    label def Con19 3 "not well", modify
    label def Con19 4 "not at all", modify


    So far I have not recoded answers like "don't know" or "refused" as missing (.). Whenever a question was asked in only one panel wave, the answers in the other wave are already coded as system missing (.), so I was afraid I would mix things up if I also recoded the true missing answers to (.).
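
    One way to keep the two kinds of missing apart could be Stata's extended missing values (.a to .z). The sketch below assumes the example data above are in memory and that -99 is the only special code; other codes such as "don't know" or "refused" do not appear in the example and would need their own mapping (for instance to .b and .c).

    ```stata
    * Sketch only: map the -99 "filtered" code to the extended missing value .a.
    * System missing (.) then still means "not asked in this wave".
    * The choice of .a is an assumption made for illustration.
    mvdecode ACT WK_RC SAMEJB FR1CB LRCSPK, mv(-99 = .a)

    * With the missing option, .a and . are tabulated as separate categories
    tabulate SAMEJB wave, missing
    ```

    Both . and .a are excluded from estimation commands, but they remain distinguishable when checking the data.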


    Now my questions are:

    1. How do I create a time variable for survival analysis, that is, the time from the date of migration to the start of the first job in RC?
    I know I somehow have to combine JBSTART_RC_op and CURRJBSTART_op (and maybe even SAMEJB) before subtracting IMDATE_op from the result, but I don't know how to do it (especially since I have so many false "missings" in these variables because each was asked in only one wave).
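
    The combination described in question 1 could be sketched roughly as follows. This is only one possible reading, not a definitive solution: it assumes the example data above are in memory, that the date strings are in "MM/YYYY" form, and that the first job is the earliest start date reported in either wave. The names im_m, job1_m, curr_m, im, first_job, time and fail are hypothetical.

    ```stata
    * Sketch only; assumes the example data above are in memory.
    * Convert "MM/YYYY" strings to monthly dates. Strings that do not parse,
    * such as "-99 (filtered)" or "", become missing.
    gen im_m   = monthly(IMDATE_op, "MY")
    gen job1_m = monthly(JBSTART_RC_op, "MY")
    gen curr_m = monthly(CURRJBSTART_op, "MY")

    * Copy the wave-1 migration date to every row of the same person
    bysort id: egen im = max(im_m)

    * Assumption: the first job is the earliest start date reported in either wave.
    * min() ignores missing arguments, so one-wave-only dates are handled.
    bysort id: egen first_job = min(min(job1_m, curr_m))
    format im_m job1_m curr_m im first_job %tm

    * Months from migration to first job; missing if no job start was reported
    gen time = first_job - im
    gen byte fail = !missing(first_job)

    * Flag implausible cases where the job starts before migration
    list id im first_job time if time < 0
    ```

    The sketch does not use SAMEJB and does not set a censoring time for respondents without a job; that would need an interview date, which is not part of the example data.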

    2. How to create the failure indicator (employed: yes/no) while correctly taking into account respondents who are on maternity/paternity leave?


    Kind regards,
    Anna