Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating max change within group

    I am trying to calculate the max change in titer between two date points. I am only interested in positive change (increase) in titer from previous titer. I want to return the max positive change in titer and the date when the associated with that max change. For example for id 1002, I want to return 37762.7 (max positive change) and 20nov2021 as the date associated with the max change. I know I can calculate each change say for example between [_n] and [_n-1] and then pick the max of all changes but I'm guessing there should be an more direct and quicker way to go around it.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(id date) double titer
    1002 22576    1237
    1002 22604 38999.7
    1002 22660 23860.9
    1002 22688 16324.7
    1003 22576  1047.2
    1003 22604 11671.1
    1003 22632  6773.4
    1003 22660  4807.9
    1003 22688  3728.4
    1004 22576    1218
    1004 22604  1055.5
    1004 22660  1011.8
    1004 22688   799.8
    1006 22590 17043.7
    1006 22618   11519
    1006 22653  8705.2
    1006 22674  9434.6
    1010 22576  1571.4
    1010 22604 44310.5
    1010 22632 36383.5
    1010 22660 34230.2
    1010 22688 23310.7
    1013 22632 22698.2
    1014 22590   109.1
    1014 22618    91.2
    1014 22660  4419.5
    1014 22674  3779.8
    1015 22576 15155.4
    1015 22604 11298.2
    1015 22632  8687.6
    1015 22660  8615.3
    1015 22688  7117.2
    1017 22576  1683.1
    1017 22604 19150.9
    1017 22632 10715.9
    1017 22660 10461.9
    1017 22688  7479.3
    1019 22562  1307.4
    1023 22576 42075.2
    1023 22604 37240.6
    1023 22632 31630.4
    1023 22660   33234
    1023 22688   30187
    1024 22576     165
    1024 22604   177.3
    1024 22632   171.4
    1024 22660   50000
    1024 22688   50000
    1029 22562   911.2
    1029 22590     647
    1029 22618  3839.5
    1029 22653 33989.7
    1029 22674 25002.9
    1030 22590 12722.1
    1030 22618 43168.9
    1030 22653 19063.9
    1030 22674 14268.2
    1034 22604   173.5
    1034 22632     160
    1034 22660   170.1
    1034 22688   167.1
    1055 22562  7015.9
    1055 22590  5013.6
    1055 22618  3243.2
    1055 22653  2171.8
    1055 22674    1864
    1062 22576  1692.7
    1062 22604  1410.4
    1062 22632  1058.5
    1062 22660  1152.9
    1062 22688    1106
    1065 22576   40226
    1065 22604 32992.2
    1065 22632 28612.1
    1065 22674 29142.3
    1074 22562   371.6
    1075 22576   517.3
    1075 22604   472.3
    1075 22632   479.4
    1075 22660   443.3
    1075 22688  5160.8
    1079 22562  2539.9
    1079 22590  2013.5
    1079 22618  1574.2
    1079 22653   50000
    1079 22674   50000
    1081 22562  3707.4
    1081 22590  2788.3
    1081 22618   50000
    1081 22653 46841.5
    1081 22674   50000
    1086 22562 43792.2
    1086 22590 29634.4
    1086 22618 20749.2
    1086 22653 14729.9
    1086 22674 15292.3
    1088 22562   50000
    1088 22590 31583.1
    1088 22618 38785.5
    1088 22653 36391.9
    end
    format %td date
    Thanks

  • #2
    Originally posted by David Wambui View Post
    I know I can calculate each change say for example between [_n] and [_n-1] and then pick the max of all changes but I'm guessing there should be an more direct and quicker way to go around it.
    Because egen does not support subscripting, you need to do this in several steps as you describe.

    Code:
    bys id (date): gen negchange= -(titer- titer[_n-1])
    bys id (negchange): gen wanted= date[1] if !missing(negchange[1])
    bys id: egen maxchange= total(abs(negchange)*(wanted==date)) if !missing(wanted)
    format wanted %td
    Last edited by Andrew Musau; 22 Feb 2022, 09:53.

    Comment


    • #3
      Thanks Andrew. This works great.

      Comment


      • #4
        Let's spell out what Andrew Musau is doing here. At first sight, it would be simplest to find the largest positive change, but that is not going to work quite as you want

        1. always, for the first observation for each identifier has no predecessor and so change will be reported as missing, which will sort to the end of any block of observations

        2. possibly if there are missing values elsewhere.

        So Andrew negates the change and look for the first (lowest negative OR highest positive) change after sorting.

        Just possibly there is a need to spell out that the wanted change will be negative

        Code:
         
         bys id (negchange): gen wanted= date[1] if negchange[1] < 0

        Comment


        • #5
          Nick Cox the negating of the change is what works magic here. Andrew Musau, I think that last part you meant "format wanted %td"

          Comment


          • #6
            Originally posted by Nick Cox View Post

            Just possibly there is a need to spell out that the wanted change will be negative

            Code:
            bys id (negchange): gen wanted= date[1] if negchange[1] < 0
            Thanks Nick for the explanation and making the point that the smallest negated difference may be positive (implying no positive changes). This did not cross my mind.



            Andrew Musau, I think that last part you meant "format wanted %td"
            Thanks David. I had realized that and corrected the code.

            Comment


            • #7
              Originally posted by Nick Cox View Post
              Let's spell out what Andrew Musau is doing here. At first sight, it would be simplest to find the largest positive change, but that is not going to work quite as you want

              1. always, for the first observation for each identifier has no predecessor and so change will be reported as missing, which will sort to the end of any block of observations

              2. possibly if there are missing values elsewhere.

              So Andrew negates the change and look for the first (lowest negative OR highest positive) change after sorting.

              Just possibly there is a need to spell out that the wanted change will be negative

              Code:
              bys id (negchange): gen wanted= date[1] if negchange[1] < 0
              Thanks Nick Cox. Since I'm only interested in a positive change (increase in titer over time) I had to tweak Andrew Musau's code a little for the results I wanted.

              Code:
              bys id (negchange): gen highest_incr = -(negchange) if _n==1 & negchange < 0
              bys id (negchange): replace highest_incr = highest_incr[1] if highest_incr==.

              Comment

              Working...
              X