Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Data Validation in Survey Timestamps

    This is the question I got from someone: "The field team was instructed to survey respondents during the hours of 9 am and 6 pm. How many instances in the sample violated this rule?" can anyone suggest the idea how to work on this query?

    I added the dataex for your reference.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int uniqueid str24(starttime endtime) byte(b1_householdmembers b4_relationshiptohh1 b5_sex1 b6_age1)
    1002 "Jan 2, 2011 9:43:04 PM"   "Jan 2, 2011 10:52:17 PM"   2 1 1 46
    1003 "Jan 1, 2011 3:15:36 AM"   "Jan 1, 2011 5:23:32 AM"   10 1 1 56
    1009 "Apr 3, 2013 10:33:29 PM"  "Apr 3, 2013 11:44:44 PM"   3 1 1 38
    1012 "Apr 5, 2013 7:21:19 PM"   "Apr 5, 2013 8:33:34 PM"    3 1 1 58
    1044 "Jan 1, 2011 3:49:44 AM"   "Jan 1, 2011 5:12:07 AM"    5 2 2 39
    1046 "Apr 10, 2013 5:33:58 PM"  "Apr 10, 2013 6:47:00 PM"   4 1 2 49
    1052 "Apr 4, 2013 5:47:45 PM"   "Apr 4, 2013 6:14:38 PM"    6 1 1 36
    1054 "Apr 4, 2013 6:56:03 PM"   "Apr 4, 2013 7:20:36 PM"    4 1 1 22
    1061 "Apr 10, 2013 7:33:17 PM"  "Apr 10, 2013 8:16:45 PM"   2 1 1 56
    1066 "Jan 3, 2011 3:00:41 AM"   "Mar 23, 2013 4:50:01 PM"   4 1 1 43
    1070 "Apr 1, 2013 10:01:56 PM"  "Apr 1, 2013 11:13:25 PM"   4 1 2 41
    1081 "Apr 10, 2013 9:17:56 PM"  "Apr 10, 2013 9:57:18 PM"   3 1 2 47
    1091 "Apr 10, 2013 10:34:10 PM" "Apr 10, 2013 11:11:48 PM"  5 1 1 52
    1119 "Apr 1, 2013 6:58:08 PM"   "Apr 1, 2013 7:28:50 PM"    2 1 2 58
    1146 "Mar 20, 2013 12:53:11 AM" "Mar 20, 2013 1:41:32 AM"   2 2 1 60
    1151 "Apr 11, 2013 3:59:31 PM"  "Apr 11, 2013 4:26:26 PM"   5 1 1 47
    1201 "Mar 22, 2013 9:26:41 PM"  "Mar 22, 2013 10:06:43 PM"  4 1 1 43
    1202 "Mar 22, 2013 10:16:18 PM" "Mar 22, 2013 11:00:54 PM"  4 1 1 30
    1203 "Mar 23, 2013 4:29:09 PM"  "Mar 23, 2013 5:12:38 PM"   6 1 1 40
    1204 "Mar 25, 2013 7:55:08 PM"  "Mar 25, 2013 8:27:40 PM"   5 1 1 43
    1205 "Mar 25, 2013 8:56:36 PM"  "Mar 25, 2013 10:36:42 PM"  7 1 1 48
    1206 "Mar 26, 2013 4:55:18 PM"  "Mar 26, 2013 5:21:42 PM"   4 1 1 52
    1207 "Mar 26, 2013 7:20:52 PM"  "Mar 26, 2013 7:48:00 PM"   4 1 1 56
    1208 "Mar 26, 2013 9:43:23 PM"  "Mar 26, 2013 10:12:12 PM"  5 1 1 42
    1209 "Mar 27, 2013 4:52:29 PM"  "Mar 27, 2013 5:15:38 PM"   4 1 1 62
    1210 "Mar 28, 2013 6:42:23 PM"  "Mar 28, 2013 7:06:09 PM"   6 1 1 64
    1211 "Mar 28, 2013 7:17:32 PM"  "Mar 28, 2013 7:37:03 PM"   3 1 1 51
    1212 "Apr 1, 2013 4:19:51 PM"   "Apr 1, 2013 4:47:20 PM"    3 1 1 54
    1213 "Apr 1, 2013 5:09:49 PM"   "Apr 1, 2013 5:36:37 PM"    5 1 1 42
    1214 "Apr 2, 2013 4:08:19 PM"   "Apr 2, 2013 4:42:30 PM"    4 1 1 62
    1215 "Apr 2, 2013 5:19:14 PM"   "Apr 2, 2013 5:46:37 PM"    3 1 1 41
    1216 "Apr 2, 2013 6:04:48 PM"   "Apr 2, 2013 6:34:14 PM"    5 1 1 46
    1217 "Apr 2, 2013 7:38:06 PM"   "Apr 2, 2013 8:13:52 PM"    4 1 1 33
    1218 "Apr 2, 2013 9:29:10 PM"   "Apr 2, 2013 9:56:32 PM"    4 1 1 50
    1219 "Apr 2, 2013 10:21:59 PM"  "Apr 2, 2013 11:04:21 PM"   4 1 1 31
    1220 "Apr 4, 2013 5:49:05 PM"   "Apr 4, 2013 6:09:30 PM"    4 1 1 42
    1221 "Apr 4, 2013 6:31:47 PM"   "Apr 4, 2013 7:08:14 PM"    5 1 1 45
    1222 "Apr 4, 2013 8:33:17 PM"   "Apr 4, 2013 9:00:34 PM"    3 1 1 39
    1223 "Apr 6, 2013 6:09:46 PM"   "Apr 6, 2013 6:37:57 PM"    5 1 1 39
    1224 "Apr 6, 2013 6:46:50 PM"   "Apr 6, 2013 7:05:25 PM"    5 1 1 38
    1225 "Apr 8, 2013 5:29:29 PM"   "Apr 8, 2013 5:51:20 PM"    4 1 1 28
    1226 "Apr 8, 2013 7:22:21 PM"   "Apr 8, 2013 7:50:28 PM"    7 1 1 62
    1227 "Apr 11, 2013 6:20:58 PM"  "Apr 11, 2013 6:54:05 PM"   6 1 1 59
    1228 "Apr 11, 2013 7:27:06 PM"  "Apr 11, 2013 7:41:45 PM"   5 1 1 40
    1229 "Apr 12, 2013 4:53:46 PM"  "Apr 12, 2013 9:07:34 PM"   7 1 1 57
    1230 "Apr 12, 2013 5:32:58 PM"  "Apr 12, 2013 6:09:27 PM"   6 1 1 37
    1231 "Apr 12, 2013 6:24:48 PM"  "Apr 12, 2013 6:57:29 PM"   4 1 1 35
    1232 "Apr 12, 2013 7:16:59 PM"  "Apr 12, 2013 7:45:15 PM"   4 1 2 45
    1235 "Apr 16, 2013 7:07:51 PM"  "Apr 17, 2013 12:08:18 AM"  6 1 1 35
    1236 "Apr 17, 2013 8:25:56 PM"  "Apr 17, 2013 12:09:50 AM"  4 1 1 49
    1237 "Apr 17, 2013 9:20:11 PM"  "Apr 17, 2013 12:12:18 AM"  4 1 1 55
    1238 "Apr 17, 2013 9:55:35 PM"  "Apr 17, 2013 12:13:17 AM"  3 1 1 42
    1239 "Apr 18, 2013 6:36:16 PM"  "Apr 18, 2013 9:13:07 PM"   3 1 2 50
    1240 "Apr 18, 2013 6:50:04 PM"  "Apr 18, 2013 9:15:16 PM"   6 1 1 55
    1241 "Apr 18, 2013 7:25:15 PM"  "Apr 18, 2013 9:18:07 PM"   3 1 1 65
    1242 "Apr 18, 2013 8:55:54 PM"  "Apr 18, 2013 10:25:53 PM"  4 1 1 42
    1243 "Apr 19, 2013 4:39:06 PM"  "Apr 19, 2013 8:42:39 PM"   4 1 1 40
    1244 "Apr 19, 2013 5:09:35 PM"  "Apr 19, 2013 10:56:55 PM"  5 1 1 48
    1245 "Apr 19, 2013 9:13:58 PM"  "Apr 19, 2013 11:00:48 PM"  5 8 1 59
    1246 "Apr 19, 2013 8:46:21 PM"  "Apr 19, 2013 11:01:42 PM"  5 1 1 34
    1247 "Apr 19, 2013 9:29:53 PM"  "Apr 19, 2013 11:02:38 PM"  3 1 1 34
    1301 "Mar 22, 2013 12:32:43 PM" "Mar 22, 2013 1:17:57 PM"   4 1 1 50
    1302 "Mar 22, 2013 2:11:35 PM"  "Mar 22, 2013 3:17:21 PM"   9 1 2 56
    1303 "Mar 22, 2013 4:44:50 PM"  "Mar 22, 2013 5:36:02 PM"   6 2 1 52
    1304 "Mar 22, 2013 5:49:51 PM"  "Mar 22, 2013 6:25:01 PM"   7 1 2 45
    1305 "Mar 23, 2013 10:36:41 AM" "Mar 23, 2013 11:14:21 AM"  8 1 1 62
    1308 "Mar 23, 2013 3:37:42 PM"  "Mar 23, 2013 4:01:31 PM"   4 2 1 38
    1311 "Mar 27, 2013 4:07:38 PM"  "Mar 28, 2013 7:53:40 PM"   3 1 1 32
    1313 "Mar 27, 2013 6:06:54 PM"  "Mar 27, 2013 6:24:29 PM"   4 1 1 30
    1314 "Mar 27, 2013 6:43:11 PM"  "Mar 27, 2013 7:10:50 PM"   5 1 1 62
    1318 "Apr 1, 2013 10:26:19 PM"  "Apr 1, 2013 10:49:39 PM"   4 1 2 40
    1322 "Apr 3, 2013 5:12:10 PM"   "Apr 3, 2013 5:32:53 PM"    4 1 1 36
    1325 "Apr 6, 2013 7:36:34 PM"   "Apr 6, 2013 10:02:57 PM"   5 1 1 60
    1327 "Apr 8, 2013 6:02:15 PM"   "Apr 8, 2013 6:16:55 PM"    4 1 1 52
    1328 "Apr 8, 2013 8:32:52 PM"   "Apr 8, 2013 8:54:19 PM"    5 1 1 45
    1329 "Apr 9, 2013 5:48:23 PM"   "Apr 9, 2013 10:21:17 PM"   4 1 1 40
    1330 "Apr 9, 2013 6:13:05 PM"   "Apr 9, 2013 11:11:21 PM"   3 1 1 55
    1331 "Apr 11, 2013 4:52:16 PM"  "Apr 11, 2013 5:34:25 PM"   3 1 1 41
    1342 "Apr 11, 2013 7:52:45 PM"  "Apr 11, 2013 8:15:08 PM"   4 1 1 50
    1343 "Apr 11, 2013 8:39:04 PM"  "Apr 11, 2013 9:02:47 PM"   5 1 1 34
    1344 "Apr 12, 2013 4:44:54 PM"  "Apr 12, 2013 5:02:54 PM"   3 1 2 38
    1345 "Apr 12, 2013 6:16:29 PM"  "Apr 12, 2013 7:54:32 PM"   4 1 1 34
    1346 "Apr 12, 2013 9:35:07 PM"  "Apr 12, 2013 9:47:25 PM"   4 1 1 40
    1401 "Mar 22, 2013 7:44:38 PM"  "Mar 22, 2013 8:45:23 PM"   6 2 1 64
    1403 "Mar 23, 2013 4:14:32 PM"  "Mar 23, 2013 5:51:48 PM"   4 2 1 40
    1404 "Mar 23, 2013 4:52:38 PM"  "Mar 23, 2013 5:21:14 PM"   6 2 1 45
    1405 "Mar 23, 2013 9:14:48 PM"  "Mar 23, 2013 9:37:01 PM"   3 1 2 35
    1406 "Mar 26, 2013 4:53:35 PM"  "Mar 26, 2013 5:17:06 PM"   2 1 1 66
    1407 "Mar 26, 2013 6:41:58 PM"  "Mar 26, 2013 7:09:05 PM"   3 2 1 45
    1408 "Mar 26, 2013 7:16:47 PM"  "Mar 26, 2013 7:32:47 PM"   3 2 1 26
    1409 "Mar 27, 2013 4:05:15 PM"  "Mar 27, 2013 4:29:44 PM"   4 1 1 38
    1410 "Mar 27, 2013 5:19:35 PM"  "Mar 27, 2013 5:51:26 PM"   4 2 1 38
    1411 "Mar 27, 2013 6:55:24 PM"  "Mar 27, 2013 7:21:50 PM"   4 2 1 41
    1412 "Mar 27, 2013 8:47:07 PM"  "Mar 27, 2013 9:06:08 PM"   3 1 2 42
    1413 "Mar 28, 2013 7:35:09 PM"  "Apr 2, 2013 8:30:23 PM"    4 8 2 59
    1414 "Apr 1, 2013 6:04:37 PM"   "Apr 1, 2013 6:32:57 PM"    6 1 1 56
    1415 "Apr 1, 2013 7:26:29 PM"   "Apr 1, 2013 7:47:43 PM"    5 1 1 43
    1416 "Apr 1, 2013 8:51:45 PM"   "Apr 1, 2013 9:07:22 PM"    5 2 1 55
    1417 "Apr 2, 2013 5:00:17 PM"   "Apr 2, 2013 5:21:28 PM"    4 1 1 32
    1418 "Apr 2, 2013 9:01:18 PM"   "Apr 2, 2013 9:27:13 PM"    4 2 1 40
    end

  • #2
    Thank you for proving a data example. The first step is to convert the start and end datetimes into Stata's internal formats. Once this is done, you have access to all the convenient datetime functions to perform whatever manipulations you need.

    The last one considers one such violation. Here I made an assumption since you haven't clearly defined what should happen if the survey duration falls over the start workday boundary (e.g., 9 AM or 6 PM). Therefore I consider 2 scenarios. Scenario one (-violation1-) flags any survey that starts after 6 PM or ends prior to 9 AM. The second scenario considers only surveys that start outside of the 9 AM to 6 PM window. This should give you an idea of where to start and how to make any changes you may need.

    I did not consider what happens if a survey collection spans multiple days, but that wasn't evident in your example data.

    Code:
    gen double start_dtm = clock(starttime, "MDYhms")
    gen double end_dtm = clock(starttime, "MDYhms")
    format start_dtm end_dtm %tc
    
    gen byte violation1 = hh(start_dtm) >= 18 | hh(end_dtm) <= 9
    gen byte violation2 = !inrange(hh(start_dtm), 9, 18)
    Last edited by Leonardo Guizzetti; 05 Feb 2024, 12:02.

    Comment


    • #3


      "Thank you, Leonardo.
      I've decided to focus solely on the second scenario, and your insights have provided me with valuable ideas. I appreciate your assistance."

      Comment

      Working...
      X