Hello everyone,
I have an administrative dataset with a variable containing durations a person spends in detention. The detention can be for up to 72 hours. There are ~250,000 observations. I have calculated the durations in Excel (HH:MM:SS). I am struggling to import the Excel durations into a manner that makes sense in Stata. This specific issue is how to represent periods beyond 24:00:00.
I understand, at a very basic level, this is because, as Nick Cox has explained in a post some years ago "[d]urations are often best not regarded as date-times at all", in part because of Stata's anchoring of durations around 1 January 1960 and also because Stata works like a 24 hour clock.
Here's how a snapshot of the data looks in Stata on import from Excel. My aim is to get these durations converted and formatted in a manner Stata make sense of so, ultimatey, I can construct a model that predicts how DURATIONVARIABLE relates to various independent variables in the dataset.
* Example generated by -dataex-. For more info, type help dataex
clear
input str10 CASENUMBER str9 DURATIONVARIABLE
"1536866819" "2:02:10"
"2858545465" "7:56:07"
"2495167939" "12:09:29"
"1376276231" "5:56:43"
"4398713795" "13:24:53"
"1983011019" "32:39:34"
"1095268814" "5:28:22"
"39381691" "12:52:29"
"1562593376" "12:09:42"
"3009819123" "8:29:43"
"1950424550" "23:08:02"
"3548316279" "5:29:01"
"3384273420" "19:53:32"
"2498842354" "35:30:55"
"4852829451" "5:13:02"
"75308479" "34:05:44"
"281657202" "21:25:52"
"1589321058" "14:19:53"
"5081152858" "19:49:34"
"4563397312" "12:15:28"
end
[/CODE]
Any thoughts greatly appreciated!
Richard
I have an administrative dataset with a variable containing durations a person spends in detention. The detention can be for up to 72 hours. There are ~250,000 observations. I have calculated the durations in Excel (HH:MM:SS). I am struggling to import the Excel durations into a manner that makes sense in Stata. This specific issue is how to represent periods beyond 24:00:00.
I understand, at a very basic level, this is because, as Nick Cox has explained in a post some years ago "[d]urations are often best not regarded as date-times at all", in part because of Stata's anchoring of durations around 1 January 1960 and also because Stata works like a 24 hour clock.
Here's how a snapshot of the data looks in Stata on import from Excel. My aim is to get these durations converted and formatted in a manner Stata make sense of so, ultimatey, I can construct a model that predicts how DURATIONVARIABLE relates to various independent variables in the dataset.
* Example generated by -dataex-. For more info, type help dataex
clear
input str10 CASENUMBER str9 DURATIONVARIABLE
"1536866819" "2:02:10"
"2858545465" "7:56:07"
"2495167939" "12:09:29"
"1376276231" "5:56:43"
"4398713795" "13:24:53"
"1983011019" "32:39:34"
"1095268814" "5:28:22"
"39381691" "12:52:29"
"1562593376" "12:09:42"
"3009819123" "8:29:43"
"1950424550" "23:08:02"
"3548316279" "5:29:01"
"3384273420" "19:53:32"
"2498842354" "35:30:55"
"4852829451" "5:13:02"
"75308479" "34:05:44"
"281657202" "21:25:52"
"1589321058" "14:19:53"
"5081152858" "19:49:34"
"4563397312" "12:15:28"
end
[/CODE]
Any thoughts greatly appreciated!
Richard
Comment