  • Loop-command for time variables

    Dear forum users,

    I am using a large REDCap dataset. REDCap automatically generates a Stata do-file to help import all variable types into Stata.

    However, while this generic do-file contains commands to “destring” all date variables, it does not do so for time variables. In other words, all time variables are still in string format and not yet suitable for analysis. To avoid typing hundreds of single commands manually, I was wondering whether somebody knows a useful loop to automatically “destring” all time variables in the dataset.

    Best wishes,
    Hannes

  • #2
    Could you show us what your real data look like by using dataex? Let's have a look at 5 such variables, or however many you'd like.


    • #3
      What Jared Greathouse said, plus how are your date variables to be recognised as such? Do they have distinctive variable names, or variable labels, or are you relying on their content being diagnostic?


      • #4
        Dear Jared, dear Nick,

        These are a few such time variables. They are in the format hh:mm. Please note that it is a long dataset and that not every variable has data at every time point.
        vs_bp_stop te_temp_time ch_chem_time hem_time ur_time fi_smears_time gen_dbs_time
        09:37 09:38 10:01 10:01 11:13 10:01 10:01
        15:10 15:14
        16:55
        18:48
        20:25
        23:25 23:26 23:26
        03:19 03:06 03:26 03:26
        15:28 15:30 15:01 15:01
        03:10 03:12 03:20 03:20
        15:16 15:18 15:20 15:20 15:26 15:20 15:20
        03:20 03:29 03:26 03:26
        15:40 15:27 15:30 15:30 16:17 15:30 15:30
        11:01 11:05 11:13 11:13 11:20 11:13 11:13
        08:40 08:44 08:49 08:49 08:57 08:49 08:49
        17:20 17:23 17:27 17:27
        11:00 11:03 11:12 11:12 11:21 11:12 11:12
        11:35 11:37 11:37
        10:44 10:44 10:48 10:48



        Concerning Nick's questions: About 75% of all variables have the suffix "_time" in the variable name. Virtually all variable labels have a mention of "Time" in them.

        Best wishes and thanks for your help!
        Hannes
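
The two criteria described above (the "_time" suffix in the name and a mention of "Time" in the label) could be combined into one variable list roughly as follows. This is only a sketch on toy data: the labels and the variable age are made up for illustration and are not taken from the real dataset.

```stata
* Toy data; the labels and the variable age are hypothetical stand-ins
clear
input str5(hem_time vs_bp_stop) byte age
"10:01" "09:37" 30
""      "15:10" 41
end
label variable hem_time   "Time of hematology sample"
label variable vs_bp_stop "Time BP measurement stopped"
label variable age        "Age in years"

* Criterion 1: variable name ends in _time
ds *_time
local by_name `r(varlist)'

* Criterion 2: variable label mentions "time", ignoring case
ds, has(varlabel *time*) insensitive
local by_label `r(varlist)'

* Union of both lists, without duplicates
local timevars : list by_name | by_label
display "`timevars'"
```

Whether the union or the intersection is the better choice depends on how many non-time variables happen to mention "time" in their labels, so the resulting list is worth checking by eye.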


        • #5
          For us to be able to work with this, especially since you're asking about date/time variables which can be a notoriously difficult subject even for veteran Stata users, you'll need to use the dataex command to show your data to us, as I asked in my first post.


          • #6
            You can loop over all time variables that have time in the variable name or variable label. For example, suppose you want a numeric variable with units minutes. You could go

            Code:
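            * Find variables with "time" in the name or variable label
            lookfor time
            local myvars `r(varlist)'

            * Each selected variable is assumed to be a string of the form hh:mm
            gen where_is_colon = .
            foreach v of local myvars {
                   replace where_is_colon = strpos(`v', ":")
                   * minutes since midnight = 60 * hours + minutes
                   gen n_`v' = 60 * real(substr(`v', 1, where_is_colon - 1)) + real(substr(`v', where_is_colon + 1, .))
            }

            drop where_is_colon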
            Now you could set aside those time variables and work on the rest.

            Code:
            * List every variable that lookfor did not pick up
            lookfor time
            ds `r(varlist)', not

            local othervars `r(varlist)'
            That won't solve all your problems, necessarily.
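
As a possible variation on the loop above, the strings could also be turned into Stata clock values with clock() and the "hm" mask, which allows a time display format. This is a minimal sketch on toy data, assuming every selected variable is a string of the form hh:mm. The prefixes tc_ and min_ are made up for illustration, and the extra ds step is there only to skip any numeric variables that lookfor might return.

```stata
* Toy data in the hh:mm string format described in the thread
clear
input str5(ecg_time hem_time)
""      "10:01"
"15:10" ""
"14:49" "15:20"
end

* Names or labels mentioning "time", restricted to string variables
lookfor time
ds `r(varlist)', has(type string)
local tvars `r(varlist)'

foreach v of local tvars {
    * Milliseconds since 01jan1960 00:00; empty strings become missing
    gen double tc_`v' = clock(`v', "hm")
    format tc_`v' %tcHH:MM
    * Minutes since midnight (60000 ms per minute)
    gen double min_`v' = tc_`v' / 60000
}

list
```

The clock values need to be stored as double, since float would lose precision. The min_ variables should agree with the n_ variables produced by the loop above.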

            By the way, you mentioned destringing dates. That is usually a bad idea. https://www.stata-journal.com/articl...article=dm0098 explains.
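
To illustrate the point about dates, here is a small sketch. It assumes a date string in YYYY-MM-DD form in a hypothetical variable visit_date; neither the name nor the format is taken from the actual REDCap export. destring with ignore() only strips the hyphens and returns a number that is not a Stata date, whereas date() with a mask returns a proper daily date.

```stata
* Hypothetical date strings; name and format are assumptions
clear
input str10 visit_date
"2021-03-04"
"2021-12-31"
end

* destring just removes the hyphens: 20210304 is not a Stata date
destring visit_date, generate(bad) ignore("-")

* date() with a mask gives days since 01jan1960
gen good = date(visit_date, "YMD")
format good %td

list
```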


            • #7
              Originally posted by Jared Greathouse
              For us to be able to work with this, especially since you're asking about date/time variables which can be a notoriously difficult subject even for veteran Stata users, you'll need to use the dataex command to show your data to us, as I asked in my first post.

              Dear Jared, please find dataex code below:


              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str5(ecg_time vs_vitals_time vs_bp_start vs_bp_stop te_temp_time ch_chem_time hem_time)
              "" "09:37" "09:13" "09:37" "09:38" "10:01" "10:01"
              "15:10" "15:10" "15:00" "15:10" "15:14" "" ""
              "" "" "" "" "" "" ""
              "" "" "" "" "16:55" "" ""
              "" "" "" "" "18:48" "" ""
              "" "" "" "" "20:25" "" ""
              "" "" "" "" "23:25" "" ""
              "" "03:19" "03:06" "03:19" "03:06" "" ""
              "14:49" "15:18" "15:18" "15:28" "15:30" "" ""
              "" "03:11" "03:01" "03:10" "03:12" "" ""
              "14:42" "15:06" "15:06" "15:16" "15:18" "15:20" "15:20"
              "" "03:10" "03:10" "03:20" "03:29" "" ""
              "14:55" "15:26" "15:30" "15:40" "15:27" "15:30" "15:30"
              "11:33" "10:51" "10:51" "11:01" "11:05" "11:13" "11:13"
              "" "08:30" "08:30" "08:40" "08:44" "08:49" "08:49"
              "" "17:10" "17:10" "17:20" "17:23" "" ""
              "" "10:50" "10:50" "11:00" "11:03" "11:12" "11:12"
              end

              @Nick, thanks for your help!
