Hi Statalist Community,
I'd like to calculate the difference in days between the start and end of an individual's contracts count. However, I am particularly interested in the period between 1 January 2021 and 31 July 2023, which means that I am not interested in knowing whether an individual was already in my sample before this period or not.
Here is a small -dataex- :
The variables -id-, -idcontrato- represent respectively the ID from the client, and the Contract Number (or ID).
Can someone please help me?
Thanks in advance!
Michael
I'd like to calculate the difference in days between the start and end of an individual's contracts count. However, I am particularly interested in the period between 1 January 2021 and 31 July 2023, which means that I am not interested in knowing whether an individual was already in my sample before this period or not.
Here is a small -dataex- :
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long id double(date_contract_start date_contract_end) long idcontrato
1001 18887 21700 1001
1001 21701 22431 451697
1001 22432 22645 1236132
1001 22646 22676 1730454
1001 22677 22735 2082075
1001 22736 23010 2172904
1001 23011 23069 2872183
1001 23070 . 3107888
1005 18800 21639 1005
1005 21640 21651 420392
1005 21652 22066 432684
1005 22067 22431 720923
1005 22432 22456 1124767
1005 22457 22645 1288758
1005 22646 22676 1742918
1005 22677 22735 2036693
1005 22736 22888 2322897
1005 22889 23010 2598018
1005 23011 23041 2728124
1005 23042 23130 2991589
1005 23131 . 3215923
1006 18800 21639 1006
1006 21640 21651 420660
1006 21652 22066 432754
1006 22067 22431 716115
end
format %td date_contract_start
format %td date_contract_end
- Warning: the trickiest thing is that some contracts started before 1 January 2021, but continue through my interest period. I want to count only the days from 1 January 2021 up to the last contract date that IDs have, but I don't know how to do it.
- A good example above is ID "1005", which has a contract that starts on 1 June 2020 but continues until 31 May 2021, to be renewed thereafter.
- Also, some of my ids have missing values if the contract doesn't end (if the -date_contract_end- variable is missing, this means that the contract is still in force).
- Then, I also want to create a variable that denotes the total number of days between 1 January 2021 and 31 July 2023, please.
Code:
* Sort the data by id and date_contract_start sort id date_contract_start * Calculate the duration variable in days for the first and last contracts by id: gen first_contract_date = date_contract_start[1] by id: gen last_contract_date = date_contract_end[_N] gen total_duration_days = last_contract_date - first_contract_date
Can someone please help me?
Thanks in advance!
Michael
Comment