  • Calculating new variable

    Hello,

    I am working with a dataset of estimated glomerular filtration rate (eGFR) values and their dates. I need to create the variable CKD, which is =1 on the first date on which eGFR has been <60 for at least 90 days. Looking only at the immediately preceding determination is not enough, because it may have been taken less than 90 days earlier; in that case the determinations before it also have to be taken into account.

    Thank you!

    Here is the data example:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id date eGFR)
    1 18562  47.32877
    1 18667  79.49444
    1 18791  55.90679
    1 18962  80.51045
    1 19030  75.23867
    1 19100  71.27572
    1 19325  68.78225
    1 19432  63.01459
    1 19656  79.49444
    1 19757  58.82726
    1 19898  67.57719
    1 19969  76.62321
    1 20139  68.78225
    1 20278  56.86056
    1 20314  54.97198
    1 20395  81.03076
    1 20524  71.27572
    1 20555  66.39872
    1 20733  71.27572
    1 20804 65.245995
    1 20894  68.78225
    1 20955  79.49444
    1 21105  51.41184
    1 21236 70.014786
    1 21388  57.83385
    1 21482  64.11821
    1 21661  81.55958
    1 21935  75.23867
    1 22020 65.245995
    1 22134  84.92532
    1 22145  91.40887
    1 22148  82.64371
    1 22182  79.49444
    1 22362  59.84139
    1 22456  58.82726
    1 22698  58.82726
    1 22957  79.99842
    1 22959  81.55958
    1 22963  82.09715
    1 22971  76.62321
    2 21510  62.95226
    2 21559  52.83544
    2 21664  65.25058
    2 21846  58.64754
    2 21930  61.84075
    2 22088  52.83544
    2 22385  47.68377
    2 22557  53.75511
    2 22743  53.75511
    2 22841  67.65491
    3 20045   63.8017
    3 20111  62.65739
    3 20157  58.32484
    3 20235  50.68774
    3 20360  59.37281
    3 20482   57.2991
    3 20573  55.31163
    3 20710  59.37281
    3 20830  49.81756
    3 20992   46.5021
    end
    format %dCY-N-D date

  • #2
    I'm not sure I fully understand the variable you need to create. For the data extract you provided, would it be possible for you to create another column with the values you would like the CKD variable to take?


    • #3
      Thank you for your reply,
      The new dichotomous variable CKD (chronic kidney disease) indicates that eGFR has been <60 for at least 90 days, so it should be =1 on the first date this is met. This can happen with 2 consecutive eGFR<60 determinations more than 90 days apart, but also, for example, with 3 consecutive eGFR<60 determinations 2 months apart (in this case CKD=1 is reached at the third determination, because by then more than 90 days have passed, whereas at the second only 60 had).

      I hope I have clarified the point, which I do not find easy to explain.
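
      To make the three-determination case concrete, here is a minimal sketch with made-up values (the id, dates, and eGFR readings are hypothetical and not taken from the data example above): three consecutive eGFR<60 readings, each 60 days apart.

      Code:
      clear
      input byte id int date float eGFR
      1 22000 55
      1 22060 52
      1 22120 58
      end
      format %td date
      * days since the first reading; every reading in this toy example is below 60
      bysort id (date): gen int days_below = date - date[1]
      list, sepby(id)

      Here days_below is 0, 60 and 120, so CKD should be 1 only in the third row.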


      • #4
        Thanks, Javier. Again, that explanation would be made much clearer if you could fulfil my previous request, which is to show us the values of CKD for your data example.


        • #5
          Thanks Hemanshu,
          My difficulty is precisely how to create that variable, so I am not able to fill in the new column (that is my question). The CKD variable must be dichotomous: =1 if the row meets the criterion described above and =0 in all other rows.


          • #6
            One easy way would be to change the input line to

            Code:
            input float(id date eGFR) byte CKD
            and then at the end of each of the data lines, just add a space, followed by a 1 or 0 as appropriate, e.g.

            Code:
            1 18562  47.32877 0
            in the first row of data.

            If it is painful to do this for your entire data extract, just delete a bunch of the rows till you have a smaller extract which shows enough rows for the various types of situations to be illustrated. I reckon about 10 or so rows of a single id ought to suffice.


            • #7
              Thank you very much for the support, here is the code:

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input byte id int date float eGFR byte CKD
              1 18562 47 0
              1 18667 79 0
              1 18791 56 0
              1 18962 81 0
              1 19030 56 0
              1 19100 44 0
              1 19325 35 1
              1 19432 29 0
              1 19656 12 0
              2 21510 63 0
              2 21559 53 0
              2 21664 65 0
              2 21846 62 0
              2 21930 59 0
              2 22088 53 1
              2 22385 48 0
              2 22557 54 0
              2 22743 54 0
              2 22841 68 0
              end
              format %tdnn/dd/CCYY date


              • #8
                Perhaps this does your job:
                Code:
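                sort id date
                gen byte is_under = (eGFR < 60)
                
                * a run starts at each eGFR<60 row not preceded by an eGFR<60 row of the same id
                gen byte new_run = !(is_under[_n-1] & id == id[_n-1]) if is_under
                gen int run_no = sum(new_run) if is_under
                
                * days elapsed since the first row of the current run
                bysort run_no (date): gen int days_under = date - date[1] if is_under
                * running count, within id, of eGFR<60 rows whose run has lasted more than 90 days
                bysort id (date): gen int sum_days_under_crossed_90 = sum(days_under > 90) if is_under
                
                * flag only the first such row; the days_under check keeps rows of later, shorter runs at 0
                gen byte wanted = (is_under & days_under > 90 & sum_days_under_crossed_90 == 1)
                drop is_under new_run run_no days_under sum_days_under_crossed_90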
                You can verify that the wanted variable produces the same values as your CKD for the data extract.
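
                A minimal sketch of that check, assuming the data example from #7 (with its CKD column) is in memory and the code above has been run:

                Code:
                * stops with an error if any row disagrees
                assert wanted == CKD
                * to inspect disagreements instead of stopping:
                list id date eGFR CKD wanted if wanted != CKD, sepby(id)

                Note that the code counts a run only once it has lasted more than 90 days. If exactly 90 days should also count ("at least 90 days"), the comparison would become >= 90; which of the two is intended is an assumption to confirm against the clinical definition.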
                Last edited by Hemanshu Kumar; 03 Jun 2024, 04:03.


                • #9
                  Thank you so much, the code works perfectly!
