Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping Case-Crossover Data: Long to Wide with One Row Per Woman

    Hi everyone,

    I am stuck with the reshaping command of my dataset. My data has 7284 unique caseids, with 60 observations for each unique caseids (437 040 observations in total) with 135 variables. I am working with a case-crossover dataset where each unique_caseid has one case period (14 days) and three control periods (14 days each, for a total of 60 days per individual). My goal is to reshape the data from long to wide so that each unique_caseid is represented by a single row, with separate columns for each control_lag and day_num.

    Right now, my dataset looks like this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str18 unique_caseid float(control_lag dep_case_control depression anx_case_control anxiety daily_date) double(wbt max_temp)
    "       1   9 01_MZ" 0 1 1 1 1 22843   15.0600004196167 21.829999923706055
    "       1   9 01_MZ" 0 1 1 1 1 22844 16.149999618530273   22.8799991607666
    "       1   9 01_MZ" 0 1 1 1 1 22845 17.049999237060547 24.549999237060547
    "       1   9 01_MZ" 0 1 1 1 1 22846 16.770000457763672   24.8700008392334
    "       1   9 01_MZ" 0 1 1 1 1 22847 16.270000457763672 24.360000610351563
    "       1   9 01_MZ" 0 1 1 1 1 22848  16.31999969482422  23.81999969482422
    "       1   9 01_MZ" 0 1 1 1 1 22849 16.530000686645508  24.18000030517578
    "       1   9 01_MZ" 0 1 1 1 1 22850 16.850000381469727 24.860000610351563
    "       1   9 01_MZ" 0 1 1 1 1 22851 16.829999923706055   23.8700008392334
    "       1   9 01_MZ" 0 1 1 1 1 22852 14.539999961853027 24.139999389648438
    "       1   9 01_MZ" 0 1 1 1 1 22853 15.789999961853027 23.110000610351563
    "       1   9 01_MZ" 0 1 1 1 1 22854 14.380000114440918 23.290000915527344
    "       1   9 01_MZ" 0 1 1 1 1 22855 13.989999771118164 22.739999771118164
    "       1   9 01_MZ" 0 1 1 1 1 22856 15.069999694824219  24.34000015258789
    "       1   9 01_MZ" 0 1 1 1 1 22857  16.15999984741211 25.139999389648438
    "       1   9 01_MZ" 1 0 0 0 0 22836 15.600000381469727 23.459999084472656
    "       1   9 01_MZ" 1 0 0 0 0 22837 15.859999656677246 22.670000076293945
    "       1   9 01_MZ" 1 0 0 0 0 22838 15.920000076293945 23.309999465942383
    "       1   9 01_MZ" 1 0 0 0 0 22839 15.979999542236328 23.719999313354492
    "       1   9 01_MZ" 1 0 0 0 0 22840 15.960000038146973 24.030000686645508
    "       1   9 01_MZ" 1 0 0 0 0 22841  16.18000030517578 24.420000076293945
    "       1   9 01_MZ" 1 0 0 0 0 22842 14.609999656677246 21.959999084472656
    "       1   9 01_MZ" 1 0 0 0 0 22843   15.0600004196167 21.829999923706055
    "       1   9 01_MZ" 1 0 0 0 0 22844 16.149999618530273   22.8799991607666
    "       1   9 01_MZ" 1 0 0 0 0 22845 17.049999237060547 24.549999237060547
    "       1   9 01_MZ" 1 0 0 0 0 22846 16.770000457763672   24.8700008392334
    "       1   9 01_MZ" 1 0 0 0 0 22847 16.270000457763672 24.360000610351563
    "       1   9 01_MZ" 1 0 0 0 0 22848  16.31999969482422  23.81999969482422
    "       1   9 01_MZ" 1 0 0 0 0 22849 16.530000686645508  24.18000030517578
    "       1   9 01_MZ" 1 0 0 0 0 22850 16.850000381469727 24.860000610351563
    "       1   9 01_MZ" 2 0 0 0 0 22829              15.75  23.59000015258789
    "       1   9 01_MZ" 2 0 0 0 0 22830 16.040000915527344 23.700000762939453
    "       1   9 01_MZ" 2 0 0 0 0 22831 15.520000457763672 22.969999313354492
    "       1   9 01_MZ" 2 0 0 0 0 22832 16.020000457763672 23.829999923706055
    "       1   9 01_MZ" 2 0 0 0 0 22833 16.100000381469727  23.90999984741211
    "       1   9 01_MZ" 2 0 0 0 0 22834 15.899999618530273 22.959999084472656
    "       1   9 01_MZ" 2 0 0 0 0 22835   14.9399995803833  23.34000015258789
    "       1   9 01_MZ" 2 0 0 0 0 22836 15.600000381469727 23.459999084472656
    "       1   9 01_MZ" 2 0 0 0 0 22837 15.859999656677246 22.670000076293945
    "       1   9 01_MZ" 2 0 0 0 0 22838 15.920000076293945 23.309999465942383
    "       1   9 01_MZ" 2 0 0 0 0 22839 15.979999542236328 23.719999313354492
    "       1   9 01_MZ" 2 0 0 0 0 22840 15.960000038146973 24.030000686645508
    "       1   9 01_MZ" 2 0 0 0 0 22841  16.18000030517578 24.420000076293945
    "       1   9 01_MZ" 2 0 0 0 0 22842 14.609999656677246 21.959999084472656
    "       1   9 01_MZ" 2 0 0 0 0 22843   15.0600004196167 21.829999923706055
    "       1   9 01_MZ" 3 0 0 0 0 22822 15.880000114440918   22.1200008392334
    "       1   9 01_MZ" 3 0 0 0 0 22823 15.640000343322754  22.31999969482422
    "       1   9 01_MZ" 3 0 0 0 0 22824 15.739999771118164 22.260000228881836
    "       1   9 01_MZ" 3 0 0 0 0 22825              16.75 22.829999923706055
    "       1   9 01_MZ" 3 0 0 0 0 22826 16.780000686645508 22.729999542236328
    "       1   9 01_MZ" 3 0 0 0 0 22827  16.65999984741211  22.40999984741211
    "       1   9 01_MZ" 3 0 0 0 0 22828 16.520000457763672 23.469999313354492
    "       1   9 01_MZ" 3 0 0 0 0 22829              15.75  23.59000015258789
    "       1   9 01_MZ" 3 0 0 0 0 22830 16.040000915527344 23.700000762939453
    "       1   9 01_MZ" 3 0 0 0 0 22831 15.520000457763672 22.969999313354492
    "       1   9 01_MZ" 3 0 0 0 0 22832 16.020000457763672 23.829999923706055
    "       1   9 01_MZ" 3 0 0 0 0 22833 16.100000381469727  23.90999984741211
    "       1   9 01_MZ" 3 0 0 0 0 22834 15.899999618530273 22.959999084472656
    "       1   9 01_MZ" 3 0 0 0 0 22835   14.9399995803833  23.34000015258789
    "       1   9 01_MZ" 3 0 0 0 0 22836 15.600000381469727 23.459999084472656
    "       1  11  4_NP" 0 0 0 0 0 22664                  1  7.929999828338623
    "       1  11  4_NP" 0 0 0 0 0 22665  .7799999713897705   4.28000020980835
    "       1  11  4_NP" 0 0 0 0 0 22666                .25  5.019999980926514
    "       1  11  4_NP" 0 0 0 0 0 22667 -.2199999988079071  5.590000152587891
    "       1  11  4_NP" 0 0 0 0 0 22668                  0   4.71999979019165
    "       1  11  4_NP" 0 0 0 0 0 22669 -.5699999928474426  5.050000190734863
    "       1  11  4_NP" 0 0 0 0 0 22670 -.1000000014901161  5.550000190734863
    "       1  11  4_NP" 0 0 0 0 0 22671 -1.370000004768372   4.53000020980835
    "       1  11  4_NP" 0 0 0 0 0 22672 -2.259999990463257  3.480000019073486
    "       1  11  4_NP" 0 0 0 0 0 22673 -2.009999990463257  6.389999866485596
    "       1  11  4_NP" 0 0 0 0 0 22674 -2.099999904632568  8.369999885559082
    "       1  11  4_NP" 0 0 0 0 0 22675 -.6800000071525574  9.640000343322754
    "       1  11  4_NP" 0 0 0 0 0 22676                  0  7.170000076293945
    "       1  11  4_NP" 0 0 0 0 0 22677  -.699999988079071  4.440000057220459
    "       1  11  4_NP" 0 0 0 0 0 22678 -1.480000019073486  7.880000114440918
    "       1  11  4_NP" 1 0 0 0 0 22657  .4799999892711639  7.099999904632568
    "       1  11  4_NP" 1 0 0 0 0 22658  .1899999976158142  7.380000114440918
    "       1  11  4_NP" 1 0 0 0 0 22659 -.2199999988079071  7.860000133514404
    "       1  11  4_NP" 1 0 0 0 0 22660 -.6899999976158142   7.21999979019165
    "       1  11  4_NP" 1 0 0 0 0 22661 -2.180000066757202  9.960000038146973
    "       1  11  4_NP" 1 0 0 0 0 22662 -1.350000023841858    9.4399995803833
    "       1  11  4_NP" 1 0 0 0 0 22663  .5400000214576721  7.639999866485596
    "       1  11  4_NP" 1 0 0 0 0 22664                  1  7.929999828338623
    "       1  11  4_NP" 1 0 0 0 0 22665  .7799999713897705   4.28000020980835
    "       1  11  4_NP" 1 0 0 0 0 22666                .25  5.019999980926514
    "       1  11  4_NP" 1 0 0 0 0 22667 -.2199999988079071  5.590000152587891
    "       1  11  4_NP" 1 0 0 0 0 22668                  0   4.71999979019165
    "       1  11  4_NP" 1 0 0 0 0 22669 -.5699999928474426  5.050000190734863
    "       1  11  4_NP" 1 0 0 0 0 22670 -.1000000014901161  5.550000190734863
    "       1  11  4_NP" 1 0 0 0 0 22671 -1.370000004768372   4.53000020980835
    "       1  11  4_NP" 2 0 0 0 0 22650  1.559999942779541  9.119999885559082
    "       1  11  4_NP" 2 0 0 0 0 22651  2.720000028610229  8.579999923706055
    "       1  11  4_NP" 2 0 0 0 0 22652  2.700000047683716  9.710000038146973
    "       1  11  4_NP" 2 0 0 0 0 22653  3.380000114440918  10.52000045776367
    "       1  11  4_NP" 2 0 0 0 0 22654  2.529999971389771  10.13000011444092
    "       1  11  4_NP" 2 0 0 0 0 22655  2.559999942779541  8.140000343322754
    "       1  11  4_NP" 2 0 0 0 0 22656   .449999988079071   9.34000015258789
    "       1  11  4_NP" 2 0 0 0 0 22657  .4799999892711639  7.099999904632568
    "       1  11  4_NP" 2 0 0 0 0 22658  .1899999976158142  7.380000114440918
    "       1  11  4_NP" 2 0 0 0 0 22659 -.2199999988079071  7.860000133514404
    end
    format %td daily_date
    What I tried:
    1. [reshape wide depression anxiety dep_case_control anx_case_control interview_date, i(unique_caseid daily_date) j(control_lag)][/Reshaping using control_lag]

    This resulted in number of observations: 437,040 --> 262,224 and number of variables 130 --> 144


    2.[bysort unique_caseid control_lag (daily_date): gen day_num = _n][/Generate sequential day numbers // assigning numbers 1 to 15 within each control_lag]
    [reshape wide depression anxiety dep_case_control anx_case_control, i(unique_caseid control_lag) j(day_num)][/reshaping using day_num instead]

    This resulted in 29,136 observations, with each unique_caseid having 4 rows for the case and three control periods. (Number of variables increased from 136 to 386)

    My questions:

    How can I properly reshape the dataset so that each unique_caseid is represented by one row, with separate columns for each combination of control_lag and day_num?
    Should I include all time-varying variables in the reshape, or is there a better approach?
    Do I need to use a second reshape command following one of those that I had tried?
    And, is it feasible to do this in Stata?

    Any advice would be greatly appreciated!


  • #2
    The "i" variable in your -reshape- command needs to indicate the individual, not the individual and the time period. The "j" variable would in your situation denote the day number, 1, ..., 60. You need to first create the day number and then proceed:
    Code:
    bysort unique_caseid (daily_date): gen byte day_num = _n
    reshape wide control_lag dep_case_control depression anx_case_control anxiety daily_date wbt max_temp, ///
       i(unique_caseid) j(day_num)
    All the time-varying variables should be listed.

    However: While this creates what I think you are asking for, I suspect it's not what you need. Most statistical commands in Stata presume the "long" layout. Can you tell us the particular statistical analysis you want to do with this data? That might lead to a more helpful answer for you, likely one that does not entail this "wide" format.

    Comment


    • #3
      Hi Mike,

      thank for your input! The code you gave worked!

      Regarding your question: The code does create what I am looking for; I now have one row per unique_caseid. I needed my dataset in this format to do conditional logistic regression that also shows the ORs and CIs for each lag. When I tried in the long format, the STATA output did not distinguish between the different lags.
      Also, I will continue with a distributed lag non-linear model (DLNM) now. As far as I understood, I will need my data to be in wide format for both is this, right?

      Comment

      Working...
      X