Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Find whether the variable has remained unchanged for more than 3 months

    Hello, I want to find whether the variable has remained unchanged for more than 3 months and generate a binary variable A, (if remained: A == 1, if not: A == 1)
    My data contains more than 40,000 unique IDs and more than 1,000,000 observations, so it was hard to find a good way to make the variable I wanted.
    The below table includes each case I want to handle, and in the real dataset, it is much bigger.
    ID condition date A
    111 1 01Jan13 1
    111 1 05May13 1
    111 1 20Jun13 1
    111 0 01Dec14 0
    222 0 03Jan15 0
    222 1 05May15 0
    222 0 29May15 0
    222 1 30Jun15 0
    333 0 01Jan08 0
    333 1 05Dec09 1
    333 1 31Jun09 1
    333 1 01Oct09 1
    444 0 03Jul06 0
    444 1 06Jan07 1
    444 1 03Jun11 1
    555 0 06Jan19 0
    555 1 05Sep21 0
    555 1 15Sep21 0
    555 1 16Sep21 0
    666 0 17Jan11 0
    666 1 05May12 1
    666 1 16May12 1
    666 1 19Dec12 1
    666 1 21Dec13 1
    ID, condition, and date are the variables I have in my dataset, and variable A is the result I want.
    Also, the date variable is already in Stata format.
    I have been struggling with this question for days, so would you please help me get the result I want with handling those specific cases?
    (I've tried to solve it with a complicated loop, but it was not working cause my data was too big)

    Thanks.
    Last edited by Somi Kim; 02 Mar 2024, 06:21.

  • #2
    This raises several questions on different levels, but a positive answer also seems possible.

    1. Please note https://www.statalist.org/forums/help#stata featuring our longstanding request to use dataex to give data examples, which as flagged there is especially important for date variables, which otherwise require surgery before members here can work usefully with your example. It is no use telling us that the date is already in Stata format, whatever that means precisely.

    2. Some details in your example make little sense. In the case of 333 there was no date 31jun09 and the dates appear out of order otherwise.

    3. There is necessarily no information on what happened after the last date for each identifier, so it's hard -- indeed impossible -- to say whether there was change within 3 months. You may want to change the rule to something like "no change so far as is known".

    4. If there is no observation within 3 months, do we have any information to answer the question? This point echoes #3.

    Here I used rangestat from SSC. 3 months I have taken to mean 91 days. Again, with daily data you need a precise rule there.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int d byte condition str7 date byte a
    111 1 "01Jan13" 1
    111 1 "05May13" 1
    111 1 "20Jun13" 1
    111 0 "01Dec14" 0
    222 0 "03Jan15" 0
    222 1 "05May15" 0
    222 0 "29May15" 0
    222 1 "30Jun15" 0
    333 0 "01Jan08" 0
    333 1 "05Dec09" 1
    333 1 "31Jun09" 1
    333 1 "01Oct09" 1
    444 0 "03Jul06" 0
    444 1 "06Jan07" 1
    444 1 "03Jun11" 1
    555 0 "06Jan19" 0
    555 1 "05Sep21" 0
    555 1 "15Sep21" 0
    555 1 "16Sep21" 0
    666 0 "17Jan11" 0
    666 1 "05May12" 1
    666 1 "16May12" 1
    666 1 "19Dec12" 1
    666 1 "21Dec13" 1
    end
    
    gen betterdate = daily(date, "DMY", 2025)  
    
    drop date 
    
    rename betterdate date 
    
    egen last = max(date), by(d)
    
    rangestat (count) count=condition (min) min=condition (max) max=condition, by(d) int(date 0 91) 
    
    gen wanted1 = min == max if date < last 
    
    gen wanted2 = min == max if count > 1 & date < last 
    
    list d date last condition count wanted?, sepby(d) 
    
        +------------------------------------------------------------+
         |   d    date    last   condit~n   count   wanted1   wanted2 |
         |------------------------------------------------------------|
      1. | 111   19359   20058          1       1         1         . |
      2. | 111   19483   20058          1       2         1         1 |
      3. | 111   19529   20058          1       1         1         . |
      4. | 111   20058   20058          0       1         .         . |
         |------------------------------------------------------------|
      5. | 222   20091   20269          0       1         1         . |
      6. | 222   20213   20269          1       3         0         0 |
      7. | 222   20237   20269          0       2         0         0 |
      8. | 222   20269   20269          1       1         .         . |
         |------------------------------------------------------------|
      9. | 333   17532   18236          0       1         1         . |
     10. | 333   18236   18236          1       1         .         . |
     11. | 333       .   18236          1       .         .         . |
     12. | 333   18171   18236          1       2         1         1 |
         |------------------------------------------------------------|
     13. | 444   16985   18781          0       1         1         . |
     14. | 444   17172   18781          1       1         1         . |
     15. | 444   18781   18781          1       1         .         . |
         |------------------------------------------------------------|
     16. | 555   21555   22539          0       1         1         . |
     17. | 555   22528   22539          1       3         1         1 |
     18. | 555   22538   22539          1       2         1         1 |
     19. | 555   22539   22539          1       1         .         . |
         |------------------------------------------------------------|
     20. | 666   18644   19713          0       1         1         . |
     21. | 666   19118   19713          1       2         1         1 |
     22. | 666   19129   19713          1       1         1         . |
     23. | 666   19346   19713          1       1         1         . |
     24. | 666   19713   19713          1       1         .         . |
         +------------------------------------------------------------+

    Comment

    Working...
    X