
  • Creating a difference variable in panel

    Hi folks,

    I have panel data on patients in which each patient tracks their own blood pressure levels.

    The time and day are determined by each individual, and some patients measure 3-4 times a day while some patients log only once a week.

    I want to create a variable for systolic and diastolic that differences the preceding number in a chronological way.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte patientid str14 time int sys byte dia
    1 "2023-11-17 23 " 123  77
    1 "2023-11-18 08 " 135  80
    1 "2023-11-18 15"  128  70
    2 "2023-11-11 23 " 117  68
    2 "2023-11-18 08 " 160 100
    3 "2023-11-05 23 " 155  90
    4 "2023-11-21 23 " 140  88
    4 "2023-11-22 07 " 121  72
    4 "2023-11-23 15"  120  70
    end
    What is the best way to create a new variable that tracks the changes in the systolic and diastolic numbers in a chronological order (e.g., latest - the second latest, the second latest - the third latest, and so on).

    Appreciate your help!

  • #2
    Originally posted by Stephen Ch
    I want to create a variable for systolic and diastolic that differences the preceding number in a chronological way.

    What is the best way to create a new variable that tracks the changes in the systolic and diastolic numbers in a chronological order
    Your date-time variable seems to follow ISO 8601, and so it will sort nicely even as a string, provided that your snippet is representative. Try something like the following.
    Code:
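    * Sort by time within patient, then subtract the previous reading from the current one
    bysort patientid (time): generate int des = sys - sys[_n-1] if _n > 1
    * The data are already sorted by the line above, so -by- alone suffices here
    by patientid: generate byte ded = dia - dia[_n-1] if _n > 1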
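
    Since the string sort only works if the snippet is representative, it may be worth checking that first. The following is a minimal sketch, not a definitive implementation. It assumes every value of time has the form "YYYY-MM-DD HH" (possibly with stray spaces, as in the example) and that no patient logs two readings in the same hour. The names d_sys and d_dia are hypothetical, and ustrregexm() requires Stata 14 or later.

    ```stata
    * Remove stray leading/trailing spaces so identical timestamps compare equal
    replace time = strtrim(time)

    * Assumption check: every timestamp looks like "YYYY-MM-DD HH"
    assert ustrregexm(time, "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}$")

    * Assumption check: at most one reading per patient per hour;
    * with ties, the order within a tie would be arbitrary
    isid patientid time

    * Apply the same differencing to both measurements
    foreach v of varlist sys dia {
        bysort patientid (time): generate int d_`v' = `v' - `v'[_n-1]
    }

    list patientid time sys d_sys dia d_dia, sepby(patientid)
    ```

    The sketch drops the if _n > 1 qualifier: for each patient's first observation, sys[_n-1] refers to observation 0, which Stata evaluates as missing, so the difference is missing either way. Keeping the qualifier, as above, makes the intent more explicit.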



    • #3
      To Joseph Coveney's excellent advice, I will add this:

      I don't know what you ultimately plan to do with this data. But it is difficult to imagine that you won't at some point want to know the elapsed time between consecutive measurements, or perhaps the elapsed time from the first to the last, or something like that. While your ISO 8601 string dates are fine for sorting into chronological order, you won't be able to do date differencing with the variables in this form. Unless you know that you will not need to do that, you are better off converting these into numeric date or datetime variables.

      Code:
      //  IF YOU WILL ONLY NEED TO WORK WITH THE DATES
      gen date = daily(substr(time, 1, 10), "YMD")
      assert missing(date) == missing(time)
      format date %tdCCYY-NN-DD
      
      //  OR, IF YOU WILL NEED BOTH THE DATES AND THE HOURS
      gen double date_time = clock(time, "YMDh")
      assert missing(date_time) == missing(time)
      format date_time %tcCCYY-NN-DD_HH
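
      As an illustration of the date differencing mentioned above, here is a hedged sketch that continues from the date_time variable just created. The names gap_hours and span_hours are hypothetical. It relies on the fact that %tc values count milliseconds, so dividing by 3,600,000 converts a difference to hours.

      ```stata
      * Assumes date_time was created with clock() as shown above

      * Hours elapsed since the same patient's previous measurement
      * (missing for each patient's first measurement)
      bysort patientid (date_time): gen double gap_hours = (date_time - date_time[_n-1]) / 3600000

      * Hours from each patient's first measurement to their last
      by patientid: gen double span_hours = (date_time[_N] - date_time[1]) / 3600000

      list patientid date_time gap_hours span_hours, sepby(patientid)
      ```

      For patient 1 in the example data, gap_hours would be missing, 9, and 7, and span_hours would be 16. Sorting on the numeric date_time rather than the string also removes the dependence on the strings being consistently formatted.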



      • #4
        Hi Joseph and Clyde, thanks a lot for your help. Clyde: thank you for thinking even further about the steps to convert to datetime variables. Super helpful!
