Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identify date belonging to highest and lowest value and use in local

    Hi everyone,

    I am new to Stata, using version SE 15.1.

    In my dataset i'd like to 1) identify the time variable belonging to the highest value from another variable, and 2) identify the time variable belonging to the lowest value from another variable (when it's after the highest level and before it rises again). And then use both as cutoff points.
    Because my dataset is confidential, I provided a fake example instead.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(concentration date)
     20 22646
     30 22647
     50 22648
     55 22649
    100 22650
     90 22651
     75 22652
     70 22653
     20 22654
     10 22655
     20 22656
     30 22657
      5 22658
    end
    format %d date
    My data contains one variable (concentration) with a certrain value belonging to hormone level concentration. The other variable (time) belongs to the date that the blood sample was collected.
    The data above represents one person.
    So in the fake example the answer to my first question would be: 05jan2022. And the answer to my second question would be 10jan2022.

    Up to know I can't seem to figure out how to do this.

    With the following code I am able to create a new variable containing the highest value of the concentration variable. I then made a new variable containing the date belonging to the highest value. I can substract this from the variable date and create a new variable, time.
    Code:
    egen max = max(concentration) /*identify highest concentration*/
    generate startdate=date if concentration==max
    format startdate %td
    
    egen date_atmax=max(startdate) /* identify date belonging to highest concentration */
    format date_atmax %td
    drop startdate
    
    generate time=(date - date_atmax) /* then t=0 belongs to the highest point */
    Of course I can now drop the observations below 0. However, in my dataset it is wrong to say that the highest point is the starting point. The real starting point could be 06jan2022, depending on the different persons that are included. This time difference will not be the same for every person in my study.
    If I want to make a scatter plot or a regression between the highest and lowest value I'd like to use a local. The syntax below is of course not working, because the local is not from the variable time.
    Code:
    local start=date_atmax 
    scatter concentration time if inrange(time,`start',5)
    I initially wanted to create a local like this
    Code:
    local date_atmax=date if concentration==max
    But I cannot use if in a local.

    Is there any way I can identify the date belonging to the highest concentration more easily?
    I have not found a solution for the date belonging to the lowest concentration (where the date must be after the highest concentration, and before it start rising again).

    I hope my post is clear and if I did something in contrast to what the FAQ list displays please tell me.

    Thanks in advance.

  • #2
    I don't follow why you seek local macros here and I don't see that any are needed or would make the task easier. I imagine that you need to do this for each patient and so you need a systematic approach looping over individuals. Your way of getting the maximum concentration is fine, but here is another way. The trickier part is calculating conditionally on each patient having passed the peak. For more on the technique here see Section 9 of https://www.stata-journal.com/articl...article=dm0055

    Code:
    * Example generated by -dataex-
    
    clear
    input float(concentration date)
     20 22646
     30 22647
     50 22648
     55 22649
    100 22650
     90 22651
     75 22652
     70 22653
     20 22654
     10 22655
     20 22656
     30 22657
      5 22658
    end
    format %td date
    gen id = 1
    
    gen good = !missing(conc)
    
    bysort good id (concentration) : gen date_max = date[_N] if good
    bysort id (date_max) : replace date_max = date_max[1]
    bysort id (date): egen conc_min = min(cond(date > date_max, concentration, .))
    bysort id (date): egen date_min = min(cond(date > date_max & concentration == conc_min, date, .))
    format date_* %td
    
    list
    
    
    
    
         +---------------------------------------------------------------------+
         | concen~n        date   id   good    date_max   conc_min    date_min |
         |---------------------------------------------------------------------|
      1. |       20   01jan2022    1      1   05jan2022          5   13jan2022 |
      2. |       30   02jan2022    1      1   05jan2022          5   13jan2022 |
      3. |       50   03jan2022    1      1   05jan2022          5   13jan2022 |
      4. |       55   04jan2022    1      1   05jan2022          5   13jan2022 |
      5. |      100   05jan2022    1      1   05jan2022          5   13jan2022 |
         |---------------------------------------------------------------------|
      6. |       90   06jan2022    1      1   05jan2022          5   13jan2022 |
      7. |       75   07jan2022    1      1   05jan2022          5   13jan2022 |
      8. |       70   08jan2022    1      1   05jan2022          5   13jan2022 |
      9. |       20   09jan2022    1      1   05jan2022          5   13jan2022 |
     10. |       10   10jan2022    1      1   05jan2022          5   13jan2022 |
         |---------------------------------------------------------------------|
     11. |       20   11jan2022    1      1   05jan2022          5   13jan2022 |
     12. |       30   12jan2022    1      1   05jan2022          5   13jan2022 |
     13. |        5   13jan2022    1      1   05jan2022          5   13jan2022 |
         +---------------------------------------------------------------------+

    Comment


    • #3
      Sorry, that is only a start as there is still the condition before it rises again which was clearly stated.

      Code:
      * Example generated by -dataex-
      
      clear
      input float(concentration date)
       20 22646
       30 22647
       50 22648
       55 22649
      100 22650
       90 22651
       75 22652
       70 22653
       20 22654
       10 22655
       20 22656
       30 22657
        5 22658
      end
      format %td date
      gen id = 1
      
      gen good = !missing(conc)
      
      bysort good id (concentration) : gen date_max = date[_N] if good
      bysort id (date_max) : replace date_max = date_max[1]
      bysort id (date): gen sign = sign(concentration[_n-1] - concentration) 
      by id : gen phase = sum(sign != sign[_n-1]) if date > date_max 
      by id: egen conc_min = min(cond(date > date_max & phase == 1, concentration, .))
      by id: egen date_min = min(cond(date > date_max & concentration == conc_min & phase == 1, date, .))
      format date_* %td
      
      list
      
           +------------------------------------------------------------------------------------+
           | concen~n        date   id   good    date_max   sign   phase   conc_min    date_min |
           |------------------------------------------------------------------------------------|
        1. |       20   01jan2022    1      1   05jan2022      .       .         10   10jan2022 |
        2. |       30   02jan2022    1      1   05jan2022     -1       .         10   10jan2022 |
        3. |       50   03jan2022    1      1   05jan2022     -1       .         10   10jan2022 |
        4. |       55   04jan2022    1      1   05jan2022     -1       .         10   10jan2022 |
        5. |      100   05jan2022    1      1   05jan2022     -1       .         10   10jan2022 |
           |------------------------------------------------------------------------------------|
        6. |       90   06jan2022    1      1   05jan2022      1       1         10   10jan2022 |
        7. |       75   07jan2022    1      1   05jan2022      1       1         10   10jan2022 |
        8. |       70   08jan2022    1      1   05jan2022      1       1         10   10jan2022 |
        9. |       20   09jan2022    1      1   05jan2022      1       1         10   10jan2022 |
       10. |       10   10jan2022    1      1   05jan2022      1       1         10   10jan2022 |
           |------------------------------------------------------------------------------------|
       11. |       20   11jan2022    1      1   05jan2022     -1       2         10   10jan2022 |
       12. |       30   12jan2022    1      1   05jan2022     -1       2         10   10jan2022 |
       13. |        5   13jan2022    1      1   05jan2022      1       3         10   10jan2022 |
           +------------------------------------------------------------------------------------+

      Comment

      Working...
      X