Dear forum members,
I have a large administrative dataset with observations of offences, clustered within arrests (CUSTODYNUMBER), clustered within people (PERSONID). Many PERSONIDs have multiple entries: a PERSONID can have several observations under the same CUSTODYNUMBER if they were arrested for several offences on the same date, and also distinct CUSTODYNUMBERs if they are arrested again on a different date. The date of the CUSTODYNUMBER is held in the variable EarliestDisposalDate_. I would like to create a new variable (labelled 'REOFFENDS ON BAIL') that identifies PERSONIDs who reappear, coded according to the following conditions:
No if:
- The person (PERSONID) is released on bail (EARLIESTDISPOSALCorrected ==1) but does not reappear again later in the dataset with a distinct CUSTODYNUMBER.
Yes if:
- The person (PERSONID) is released on bail (EARLIESTDISPOSALCorrected ==1) and that PERSONID appears again in the dataset with a distinct CUSTODYNUMBER later in time (so a later EarliestDisposalDate_).
I'm struggling to write code that achieves this. I have used the following, but it is not producing accurate results (I'm finding observations in the dataset that satisfy the conditions but aren't being flagged):
[CODE]
sort PERSONID EarliestDisposalDate_
gen first_arrest_date = .
gen first_custody_number = ""
bysort PERSONID (EarliestDisposalDate_): replace first_arrest_date = EarliestDisposalDate_ if _n == 1
bysort PERSONID (EarliestDisposalDate_): replace first_custody_number = CUSTODYNUMBER if _n == 1
gen released_on_bail = 0
bysort PERSONID (EarliestDisposalDate_): replace released_on_bail = EARLIESTDISPOSALCorrected if _n == 1
gen reoffends_on_bail = 0 // Default: No reoffense
bysort PERSONID (EarliestDisposalDate_): replace reoffends_on_bail = 1 if released_on_bail == 1 & CUSTODYNUMBER != first_custody_number & EarliestDisposalDate_ > first_arrest_date
bysort PERSONID (EarliestDisposalDate_): replace reoffends_on_bail = max(reoffends_on_bail)
label var reoffends_on_bail "1 = Reoffends on Bail, 0 = No reoffense after bail"
[/CODE]
And here is a dataex example of the data (relevant variables only; PERSONID and CUSTODYNUMBER anonymised). Any help gratefully received!
[CODE]
input str8 PERSONID str10 CUSTODYNUMBER long EARLIESTDISPOSALCorrected double EarliestDisposalDate_
"[12345]" "678910" 2 21650
"12345" "678910" 2 21650
"12345" "678910" 2 21650
"12345" "678910" 2 21650
"12345" "678910" 2 21650
"12345" "10111213" 2 21680
"12345" "10111213" 2 21680
"12345" "10111213" 2 21680
"12345" "10111213" 2 21680
"12345" "10111213" 2 21688
end
format %td EarliestDisposalDate_
label values EARLIESTDISPOSALCorrected FIXEDEARLYDIS
label def FIXEDEARLYDIS 1 "Pre-ChargeBail", modify
label def FIXEDEARLYDIS 2 "ReleaseUnderInvestigation", modify
label def FIXEDEARLYDIS 3 "Charge", modify
label def FIXEDEARLYDIS 5 "NoFurtherAction", modify
label def FIXEDEARLYDIS 6 "NoChargeDecision/Misc", modify
[/CODE]
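One reading of why the attempt above flags nothing: first_arrest_date, first_custody_number and released_on_bail are only filled in on each person's first row, so the combined condition (bailed, different CUSTODYNUMBER, later date) cannot be true on any single row. In addition, max() inside replace is the row-wise function, not a by-group maximum. A minimal sketch of an alternative follows. It assumes the data are already in memory, that bail on any arrest (not only the first) should count, that the date of an arrest is the earliest EarliestDisposalDate_ recorded under its CUSTODYNUMBER, and that people never released on bail are coded 0. The names arrest_date, last_arrest_date, bail_then_rearrest and reoffends_on_bail_v2 are made up for illustration.

```stata
* Sketch only: helper variable names below are hypothetical.

* One date per arrest: the earliest date recorded under each CUSTODYNUMBER,
* so several offence rows of the same arrest never count as a "later" arrest.
bysort PERSONID CUSTODYNUMBER: egen double arrest_date = min(EarliestDisposalDate_)

* Date of each person's most recent arrest.
bysort PERSONID: egen double last_arrest_date = max(arrest_date)

* Row-level flag: this row is a bail disposal and the person has an arrest
* with a later date. Because arrest_date is constant within CUSTODYNUMBER,
* a later arrest_date necessarily belongs to a distinct CUSTODYNUMBER.
gen byte bail_then_rearrest = EARLIESTDISPOSALCorrected == 1 & arrest_date < last_arrest_date

* Spread the flag to every row of the person (egen max, not the max() function).
bysort PERSONID: egen byte reoffends_on_bail_v2 = max(bail_then_rearrest)
label var reoffends_on_bail_v2 "1 = Reoffends on Bail, 0 = No reoffense after bail"
```

In the posted example every row has EARLIESTDISPOSALCorrected equal to 2, so this flag would be 0 throughout. If people who were never bailed should be missing rather than 0, that would need an extra step.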