Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Fiter a date/time variable

    Dear Stata Experts,
    I am importing a dataset with participants' sign-in date (var name: SignInDate) in string and I was able to successfully convert it into Stata's date/time variable (var name: login_datetime) using the codes below.

    Code:
    gen double login_datetime=clock(SignInDate, "MDYhms")
    format login_datetime %tc
    The codes generated the following outputs. Here I included the first 50 observations from my dataset.

    Code:
    clear
    input str24 SignInDate double login_datetime
    "Oct 12, 2017 5:57:53 PM"  1823450273000
    "Jun 2, 2021 1:51:12 PM"   1938261072000
    "Apr 1, 2022 12:38:34 PM"  1964435914000
    "Mar 28, 2022 1:53:06 PM"  1964094786000
    "Apr 23, 2019 7:39:08 PM"  1871667548000
    "Mar 1, 2022 12:26:55 PM"  1961756815000
    "Jan 16, 2020 10:40:56 AM" 1894790456000
    "Jun 24, 2021 9:21:04 AM"  1940145664000
    "Dec 19, 2018 11:50:06 AM" 1860839406000
    "Nov 20, 2018 10:00:52 AM" 1858327252000
    "Oct 12, 2018 10:25:07 AM" 1854959107000
    "Apr 19, 2022 11:54:19 AM" 1965988459000
    "Jul 24, 2020 12:07:30 PM" 1911211650000
    "Apr 21, 2022 2:05:55 PM"  1966169155000
    "Mar 23, 2018 10:36:05 AM" 1837420565000
    "Oct 14, 2020 3:49:07 PM"  1918309747000
    "Feb 27, 2019 10:18:24 PM" 1866925104000
    "May 20, 2021 10:37:11 AM" 1937126231000
    "Mar 5, 2021 1:48:15 PM"   1930571295000
    "Mar 6, 2018 5:08:38 PM"   1835975318000
    "Jan 16, 2022 10:03:02 PM" 1957989782000
    "Feb 22, 2022 4:36:53 PM"  1961167013000
    "Apr 12, 2022 2:40:10 PM"  1965393610000
    "Apr 22, 2022 10:13:00 AM" 1966241580000
    "Nov 19, 2018 3:12:34 PM"  1858259554000
    "Aug 26, 2018 11:56:31 PM" 1850946991000
    "Jul 24, 2019 1:08:57 PM"  1879592937000
    "Feb 15, 2022 12:03:00 PM" 1960545780000
    "Jun 1, 2021 6:40:13 PM"   1938192013000
    "Jun 2, 2021 9:40:32 AM"   1938246032000
    "Feb 17, 2022 4:23:39 PM"  1960734219000
    "Feb 28, 2022 1:12:47 PM"  1961673167000
    "Apr 20, 2022 2:16:19 PM"  1966083379000
    "May 31, 2019 10:18:39 AM" 1874917119000
    "Jun 4, 2018 11:49:40 AM"  1843732180000
    "Feb 2, 2021 3:11:28 PM"   1927897888000
    "Jan 26, 2018 11:56:22 AM" 1832586982000
    "Jan 26, 2018 12:49:58 PM" 1832590198000
    "Oct 8, 2021 12:05:05 PM"  1949313905000
    "Aug 26, 2019 9:43:37 AM"  1882431817000
    "Jul 30, 2021 2:03:43 PM"  1943273023000
    "Apr 14, 2022 2:28:57 PM"  1965565737000
    "Apr 22, 2022 10:35:05 AM" 1966242905000
    "Jun 27, 2019 2:38:25 PM"  1877265505000
    "Apr 12, 2022 5:40:55 PM"  1965404455000
    "Apr 19, 2022 2:20:01 PM"  1965997201000
    "Jun 22, 2019 1:30:09 PM"  1876829409000
    "Mar 3, 2020 2:06:28 PM"   1898863588000
    "Jul 9, 2021 9:30:01 AM"   1941442201000
    "Apr 20, 2019 4:52:08 PM"  1871398328000
    end
    format %tc login_datetime
    I'm trying to identify individuals whose login date was before Oct 31, 2021 so I generated these codes below:

    Code:
    list FirstName LastName SignInDate login_datetime if login_datetime<tc(31 oct 2021 00:00:00)
    But it did not filter the date/time correctly and list all observations as if their login dates were all before Oct 31, 2021 (see example below). Can someone please tell me what might have gone wrong with my codes?

    +-----------------------------------------------+
    | SignInDate login_datetime |
    |-----------------------------------------------|
    1. | Oct 12, 2017 5:57:53 PM 12oct2017 17:57:53 |
    2. | Jun 2, 2021 1:51:12 PM 02jun2021 13:51:12 |
    5. | Apr 23, 2019 7:39:08 PM 23apr2019 19:39:08 |
    7. | Jan 16, 2020 10:40:56 AM 16jan2020 10:40:56 |
    8. | Jun 24, 2021 9:21:04 AM 24jun2021 09:21:04 |
    |-----------------------------------------------|
    9. | Dec 19, 2018 11:50:06 AM 19dec2018 11:50:06 |
    10. | Nov 20, 2018 10:00:52 AM 20nov2018 10:00:52 |
    11. | Oct 12, 2018 10:25:07 AM 12oct2018 10:25:07 |
    13. | Jul 24, 2020 12:07:30 PM 24jul2020 12:07:30 |
    15. | Mar 23, 2018 10:36:05 AM 23mar2018 10:36:05 |
    |-----------------------------------------------|
    16. | Oct 14, 2020 3:49:07 PM 14oct2020 15:49:07 |
    17. | Feb 27, 2019 10:18:24 PM 27feb2019 22:18:24 |
    18. | May 20, 2021 10:37:11 AM 20may2021 10:37:11 |
    19. | Mar 5, 2021 1:48:15 PM 05mar2021 13:48:15 |
    20. | Mar 6, 2018 5:08:38 PM 06mar2018 17:08:38 |
    |-----------------------------------------------|
    25. | Nov 19, 2018 3:12:34 PM 19nov2018 15:12:34 |
    26. | Aug 26, 2018 11:56:31 PM 26aug2018 23:56:31 |
    27. | Jul 24, 2019 1:08:57 PM 24jul2019 13:08:57 |
    29. | Jun 1, 2021 6:40:13 PM 01jun2021 18:40:13 |
    30. | Jun 2, 2021 9:40:32 AM 02jun2021 09:40:32 |
    |-----------------------------------------------|
    34. | May 31, 2019 10:18:39 AM 31may2019 10:18:39 |
    35. | Jun 4, 2018 11:49:40 AM 04jun2018 11:49:40 |
    36. | Feb 2, 2021 3:11:28 PM 02feb2021 15:11:28 |
    37. | Jan 26, 2018 11:56:22 AM 26jan2018 11:56:22 |
    38. | Jan 26, 2018 12:49:58 PM 26jan2018 12:49:58 |
    |-----------------------------------------------|


    I would really appreciate anyone's feedback!

    Thanks!

  • #2
    No, it's not listing all your observations. Look closely. Just even at the very beginning notice that observations 3, 4, and 6 are not listed (which is appropriate because they are after your specified cutoff.) Nor is 12. And so on. In fact, with your example data it finds 33 of the 50 observations that pass your filter, and 17 that do not. And by my manual count, that's exactly right.

    Comment


    • #3
      Thank you so much Clyde Schechter for pointing this out. I got fooled by the list which listed the last observation of the dataset and thought it is listing all observations. Really appreciate your help.

      Comment

      Working...
      X