Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help with Reshape

    Hello:
    I have the following dataset in long format:
    Code:
    input double(id datetime) str32 measurement double value
    1 1906654020000 "height" 1800
    1 1906654020000 "weight"   50
    1 1906654020000 "length"   90
    1 1906654020000 "width"    42
    1 1.9066542e+12 "height" 1900
    1 1.9066542e+12 "weight"   50
    1 1.9066542e+12 "length"   92
    1 1.9066542e+12 "width"    40
    1 1.9066542e+12 "depth"   880
    1 1.9066545e+12 "height" 2000
    1 1.9066545e+12 "weight"   54
    1 1.9066548e+12 "height" 1900
    1 1.9066548e+12 "width"    41
    1 1.9066548e+12 "depth"   860
    2 1.9092456e+12 "height" 1500
    2 1.9092456e+12 "weight"   55
    2 1.9092456e+12 "length"   92
    2 1.9092456e+12 "width"    45
    2 1.9092456e+12 "depth"   800
    2 1.9092459e+12 "height" 1800
    2 1.9092459e+12 "weight"   58
    2 1.9092459e+12 "width"    40
    2 1.9092462e+12 "length"   94
    2 1.9092462e+12 "width"    42
    3 1.9066542e+12 "height" 2010
    3 1.9066542e+12 "weight"   55
    3 1.9066542e+12 "length"   98
    3 1.9066542e+12 "width"    48
    3 1.9066542e+12 "depth"   820
    3 1906654320000 "height" 1900
    3 1906654320000 "weight"   52
    3 1906654320000 "depth"   850
    4  1.943424e+12 "height" 1800
    4  1.943424e+12 "weight"   50
    4  1.943424e+12 "length"   99
    4  1.943424e+12 "width"    42
    4 1.9434243e+12 "height" 1700
    4 1.9434243e+12 "weight"   57
    4 1.9434243e+12 "length"   94
    4 1.9434243e+12 "width"    44
    4 1.9434243e+12 "depth"   870
    end
    format %tcCCYY-NN-DD_HH:MM datetime
    I would like to reshape into wide but am having trouble knowing how to define the i,j, and base names to make this work.
    What I would like it to look like eventually (using id 1 as an example) is where I would have the dataset organized by id and with each measurement and corresponding value by datetime:
    id datetime height weight length width depth
    1 2020-06-01 18:07 1800 50 90 42 .
    1 2020-06-01 18:10 1900 50 92 40 880
    1 2020-06-01 18:15 2000 54 . . .
    1 2020-06-01 18:20 1900 . . 41 860
    Thank you for assistance.


  • #2
    What code did you try?

    Comment


    • #3
      Perhaps this will start you in a useful direction
      Code:
      reshape wide value, i(id datetime) j(measurement) string
      rename (value*) (*)
      Code:
      . reshape wide value, i(id datetime) j(measurement) string
      (j = depth height length weight width)
      
      Data                               Long   ->   Wide
      -----------------------------------------------------------------------------
      Number of observations               41   ->   11          
      Number of variables                   4   ->   7           
      j variable (5 values)       measurement   ->   (dropped)
      xij variables:
                                        value   ->   valuedepth valueheight ... valuewidth
      -----------------------------------------------------------------------------
      
      . rename (value*) (*)
      
      . list, sep(0)
      
           +------------------------------------------------------------------+
           | id           datetime   depth   height   length   weight   width |
           |------------------------------------------------------------------|
        1. |  1   2020-06-01 18:07       .     1800       90       50      42 |
        2. |  1   2020-06-01 18:10     880     1900       92       50      40 |
        3. |  1   2020-06-01 18:15       .     2000        .       54       . |
        4. |  1   2020-06-01 18:20     860     1900        .        .      41 |
        5. |  2   2020-07-01 18:00     800     1500       92       55      45 |
        6. |  2   2020-07-01 18:05       .     1800        .       58      40 |
        7. |  2   2020-07-01 18:10       .        .       94        .      42 |
        8. |  3   2020-06-01 18:10     820     2010       98       55      48 |
        9. |  3   2020-06-01 18:12     850     1900        .       52       . |
       10. |  4   2021-08-01 08:00       .     1800       99       50      42 |
       11. |  4   2021-08-01 08:05     870     1700       94       57      44 |
           +------------------------------------------------------------------+
      The two key points are (a) use the string option on the reshape command to use measurement as the suffix (j-value) on the newly-created variables and (b) use the rename command features described in help rename group to strip off the "value" from the beginning of the newly-created variable names.

      Comment


      • #4
        Ah, thank you William Lisowski! I believe this is working!
        Jared Greathouse and William, I was trying to use the reshape command in this manner as I felt I was so close to visualizing how the conversion would be, but could not iron out exactly which should be the i variables or the j and how to name the stub. In fact, I believe I may have fell upon this earlier, except I kept getting error that indicated: "values of variable event_name not unique within fin datetime".
        I realized that I had both some duplicate lines in the long dataset, in addition to some duplicate id, datetime, and measurement observations that aligned with differing values--likely recording errors in the data.
        I was able to clean these and your code now works to successfully convert to wide.
        The other issue is that my string names for the measurements in the longer dataset have spaces, which aren't allowed and yielded an error. I rid of the spaces and it works too.
        thanks!

        Comment

        Working...
        X