Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to compare a date to determine if it is within the range

    Hello all,

    Thanks to Joseph Coveney's help, I was able to put my data in the following format. Then I merged it with "visits" data.

    Now I have over 20K of data in the following format.
    ln4fn4dob pat_lname pat_fname visit_date auth_begdate_1 auth_enddate_1 auth_begdate_2 auth_enddate_2 auth_begdate_3 auth_enddate_3 LineNo
    abralily27590 Abramson Lily 1/30/2022 1/1/2022 1/31/2022 2/6/2022 3/8/2022 - - 1
    abralily27590 Abramson Lily 2/1/2022 1/1/2022 1/31/2022 2/6/2022 3/8/2022 - - 1
    I want to determine if the visit_date is within any of the auth_begdate and auth_enddate pairs and put a "Y" or "N" as follows:
    ln4fn4dob pat_lname pat_fname visit_date auth_begdate_1 auth_enddate_1 auth_begdate_2 auth_enddate_2 auth_begdate_3 auth_enddate_3 LineNo Authorized
    abralily27590 Abramson Lily 1/30/2022 1/1/2022 1/31/2022 2/6/2022 3/8/2022 - - 1 Y
    abralily27590 Abramson Lily 2/1/2022 1/1/2022 1/31/2022 2/6/2022 3/8/2022 - - 1 N
    My searches for a command or code did not get me closer to a solution.

    I thought of two forvalues loops, the outer loop to go through each of the 20K lines, and the inner loop to evaluate if the visit_date falls within the beginning and end dates. (I have from one to eight pairs of dates for each person.)

    Any hints would be much appreciated.

    Thanks!

    Artur



  • #2
    Your data set is a dysfunctional mess as presented. You have strings for dates, which are pretty much impossible to work with, especially for determining chronological order. And you have adopted "- " as a substitute for a missing value. So that needs to be fixed before you can resolve this problem, or, frankly, do anything useful with this data.

    Next, you have the data in wide layout. While it is possible to resolve this problem in wide layout, it requires a loop (just one, not two--in Stata it is almost never necessary to loop over observations). By contrast, your problem can be solved in one line of code with long layout.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str14 ln4fn4dob str9 pat_lname str5 pat_fname str10 visit_date str9 auth_begdate_1 str10 auth_enddate_1 str9(auth_begdate_2 auth_enddate_2) str2(auth_begdate_3 auth_enddate_3) byte lineno
    "abralily27590 " "Abramson " "Lily " "1/30/2022 " "1/1/2022 " "1/31/2022 " "2/6/2022 " "3/8/2022 " "- " "- " 1
    "abralily27590 " "Abramson " "Lily " "2/1/2022 "  "1/1/2022 " "1/31/2022 " "2/6/2022 " "3/8/2022 " "- " "- " 1
    end
    
    //    GO TO LONG LAYOUT
    gen `c(obs_t)' obs_no = _n
    reshape long auth_begdate_ auth_enddate_, i(obs_no)
    
    //    CLEAN UP THE DATE VARIABLES
    foreach v of varlist *date* {
        replace `v' = "" if inlist(trim(itrim(`v')), "-", "")
        gen _`v' = daily(`v', "MDY")
        assert missing(_`v') == missing(`v')
        format _`v' %td
        drop `v'
        rename _`v' `v'
    }
    
    by obs_no (_j), sort: egen wanted = ///
        max(inrange(visit_date, auth_begdate_, auth_enddate_) & !missing(auth_begdate_))
    Now, I have left the data set in long layout. I do that because it is likely that whatever else you are going to do with this data will also be best done, or even only possible, in long layout. Most Stata commands are optimized for long data. If you know for a fact that you will next be doing things that can only be done with the wide layout you started with, then you can get back to that arrangement by running -reshape wide-. But only do this if you know for sure that is the case. Otherwise your next post will probably be asking for help with something that seems complicated and the response will be to go once again to long data and then do something very easy.

    I do think I remember seeing your thread where Joseph Coveney advised you to use -reshape wide- on your data to produce the layout you wanted. I remember being tempted at the time to join the thread and say "be careful what you wish for." I decided to stay out of it, and now I regret that decision.

    In the future, when showing data examples, please use the -dataex- command to do so, as I have done in this response. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


    Comment


    • #3
      Clyde Schechter is correct: The data are a mess (plus the long variable names are ugly to work with) and you don't need to loop over observations.

      But I don't understand why you need a long layout if the goal is to assign a "yes" to "wanted" (or "authorized") if you want to know if a visitation date is authorized in any of the begin- and end-dates (hence if you use -egen wanted = max(...)-. In fact, you need long format (but not -egen-) if you want to know the authorization for each beg- and end-date separately, e.g.
      Code:
      gen wanted = ///
          inrange(visit_date, auth_begdate_, auth_enddate_) & !missing(auth_begdate_)
      resulting in
      Code:
      . list, noob    
      
        +-----------------------------------------------------------------------------------------------------------+
        | obs_no   _j        ln4fn4dob   pat_lname   pat_fn~e   lineno   visit_d~e   auth_be~_   auth_en~_   wanted |
        |-----------------------------------------------------------------------------------------------------------|
        |      1    1   abralily27590    Abramson       Lily         1   30jan2022   01jan2022   31jan2022        1 |
        |      1    2   abralily27590    Abramson       Lily         1   30jan2022   06feb2022   08mar2022        0 |
        |      1    3   abralily27590    Abramson       Lily         1   30jan2022           .           .        0 |
        |      2    1   abralily27590    Abramson       Lily         1   01feb2022   01jan2022   31jan2022        0 |
        |      2    2   abralily27590    Abramson       Lily         1   01feb2022   06feb2022   08mar2022        0 |
        |-----------------------------------------------------------------------------------------------------------|
        |      2    3   abralily27590    Abramson       Lily         1   01feb2022           .           .        0 |
        +-----------------------------------------------------------------------------------------------------------+
      I may be wrong, but I believe that if you want to know if a visit is authorized in any of the begin- and end-dates, wide format would be possible, as well:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      input str13 ln4fn4dob str8 pat_lname str4 pat_fname str10(visit_date auth_begdate_1 auth_enddate_1 auth_begdate_2 auth_enddate_2 auth_begdate_3 auth_enddate_3) byte LineNo
      "abralily27590" "Abramson" "Lily" "1/30/2022" "1/1/2022" "1/31/2022" "2/6/2022" "3/8/2022" "-"        "-"        1
      "abralily27590" "Abramson" "Lily" "2/1/2022"  "1/1/2022" "1/31/2022" "2/6/2022" "3/8/2022" "-"        "-"        1
      "abralily27590" "Abramson" "Lily" "1/2/2022"  "1/1/2022" "1/31/2022" "2/6/2022" "3/8/2022" "2/6/2022" "3/8/2022" 1
      "abralily27590" "Abramson" "Lily" "2/1/2022"  "1/1/2022" "1/31/2022" "2/6/2022" "3/8/2022" "2/6/2022" "3/8/2022" 1
      end
      
      foreach v of varlist visit_date-auth_enddate_3 {
         gen `v'_d = date(`v',"MDY")
         format `v'_d %td
      }
      
      gen authorized = (!mi(auth_begdate_1_d,auth_enddate_1_d) & ///
                        inrange(visit_date_d,auth_begdate_1_d,auth_enddate_1_d)) | ///
                       (!mi(auth_begdate_2_d,auth_enddate_2_d) & ///
                        inrange(visit_date_d,auth_begdate_2_d,auth_enddate_2_d)) | ///
                       (!mi(auth_begdate_3_d,auth_enddate_3_d) & ///
                        inrange(visit_date_d,auth_begdate_3_d,auth_enddate_3_d))
      lab def authorized 0 "no" 1 "yes"
      lab val authorized authorized
      resulting in
      Code:
      . list ln4fn4dob *_d authorized, noob
      
        +--------------------------------------------------------------------------------------------------------------+
        |     ln4fn4dob   visit_d~d   auth_be..   auth_en..   auth_be..   auth_en..   auth_be..   auth_en..   author~d |
        |--------------------------------------------------------------------------------------------------------------|
        | abralily27590   30jan2022   01jan2022   31jan2022   06feb2022   08mar2022           .           .        yes |
        | abralily27590   01feb2022   01jan2022   31jan2022   06feb2022   08mar2022           .           .         no |
        | abralily27590   02jan2022   01jan2022   31jan2022   06feb2022   08mar2022   06feb2022   08mar2022        yes |
        | abralily27590   01feb2022   01jan2022   31jan2022   06feb2022   08mar2022   06feb2022   08mar2022         no |
        +--------------------------------------------------------------------------------------------------------------+

      Comment

      Working...
      X