I have a long dataset that contains stays in classes of a series of schools. The data look like this:
So for example this student stayed in U16 from 1/6/16 to 5/4/16, then he/she stayed in U15 from 5/4/16 to 5/25/16. My goal is to generate moves instead of stays. So the first movement was in 5/4/2016 which is the end date of the first stay. Then I need to create a variable that show the area they were and the are the went to, in this case U16 to U15. I was able to do this by:
The result looks ok to me. The only problem is that the "to" variables such as school_to are one row before the corresponding "from" variable such as from_to and they should be in the same row. Perhaps this is not the best approach to do this. Any ideas on how to fix this issue or a new approach? One fix could be to just move the values of the "to" variables one row more (_n + 1???).
Thank you!
ID | school | area | startdt | enddt |
464 | KC | U16 | 1/6/2016 2:10 | 5/4/2016 8:32 |
464 | KC | U15 | 5/4/2016 8:32 | 5/25/2016 1:10 |
464 | KC | U16 | 5/25/2016 1:10 | 6/29/2016 10:20 |
464 | KC | U2 | 6/29/2016 10:20 | 7/2/2016 3:02 |
464 | KC | L11 | 7/2/2016 3:02 | 7/26/2016 3:30 |
Code:
bysort ID (startdt): gen MOVEMENT=enddt[_n-1] format MOVEMENT %tc bysort ID (startdt): gen school_from=school[_n-1] bysort ID (startdt)): gen area_from =area[_n-1] bysort ID (startdt)): gen school_to=school[_n+1] bysort ID (startdt)): gen area_to=area[_n+1]
ID | school | area | startdt | enddt | MOVEMENT | school_from | area_from | school_to | area_to |
464 | KC | U16 | 1/6/2016 2:10 | 5/4/2016 8:32 | KC | U15 | |||
464 | KC | U15 | 5/4/2016 8:32 | 5/25/2016 1:10 | 4-May-16 | KC | U16 | KC | U16 |
464 | KC | U16 | 5/25/2016 1:10 | 6/29/2016 10:20 | 25-May-16 | KC | U15 | KC | U2 |
464 | KC | U2 | 6/29/2016 10:20 | 7/2/2016 3:02 | 29-Jun-16 | KC | U16 | KC | L11 |
464 | KC | L11 | 7/2/2016 3:02 | 7/26/2016 3:30 | 2-Jul-16 | KC | U2 |
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input int ID str2 school str3 area double startdt str16 enddt int MOVEMENT str2 school_from str3 area_from str4 school_to str8 area_to 464 "KC" "U16" 1767665400000.0002 " 5/4/2016 8:32" . "" "" "KC" "U15" 464 "KC" "U15" 1777969920000.0002 " 5/25/2016 1:09" 20578 "KC" "U16" "KC" "U16" 464 "KC" "U16" 1779757799999.9998 " 6/29/2016 10:20" 20599 "KC" "U15" "KC" "U2" 464 "KC" "U2" 1.7828148e+12 " 7/2/2016 3:01" 20634 "KC" "U16" "KC" "L11" 464 "KC" "L11" 1783047719999.9998 " 7/26/2016 3:30" 20637 "KC" "U2" "" "" 452 "DC" "6L" 1.771665e+12 " 5/22/2016 10:02" . "" "" "BKDC" "7L" 452 "DC" "7L" 1779530520000.0002 " 6/3/2016 11:00" 20596 "DC" "6L" "BKDC" "8L" 452 "DC" "8L" 1780570859999.9998 " 6/16/2016 2:38" 20608 "DC" "7L" "BKDC" "8L" 452 "DC" "8L" 1781663880000 "." 20621 "DC" "8L" "" " " end format %tcnn/dd/ccYY_hh:MM startdt format %tddd-Mon-YY MOVEMENT
Comment