Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping a dataset

    Dear all,

    I am new to STATA and would really appreciate your help. I have a dataset that looks as follows:

    clear
    begin end output_mwh power_plant
    "01.01.2015 00:00" "01.01.2015 01:15" 318 "Gebersdorf 2"
    "01.01.2015 00:00" "01.01.2015 01:00" 82 "Zolling 5"
    "01.01.2015 00:00" "02.01.2015 00:00" 8300 "Grosskraftwerk Mannheim AG "
    "01.01.2015 00:00" "01.01.2015 01:00" 50 "Heizkraftwerk Heilbronn"
    "01.01.2015 00:00" "02.01.2015 00:00" 13901 "Boxberg, Jänschwalde, Schkopau, Schwarze Pumpe"
    "01.01.2015 03:00" "01.01.2015 04:00" 50 "Heizkraftwerk Altbach/Deizisau "
    "01.01.2015 11:00" "02.01.2015 00:00" 2966 "Zolling 5"
    "01.01.2015 11:00" "01.01.2015 14:00" 150 "Heizkraftwerk Altbach/Deizisau "
    "01.01.2015 13:00" "01.01.2015 16:00" 600 "Jänschwalde, Schwarze Pumpe"
    "01.01.2015 15:15" "02.01.2015 00:00" 1971 "Gebersdorf 2"
    "01.01.2015 17:00" "01.01.2015 19:00" 100 "Heizkraftwerk Altbach/Deizisau "
    end

    In the power_plant variable some observations have more than one observation. I would like to divide my output_mwh equally to the number of observations in the power_plant variable. Therefore, I used the code:

    [split power_plant, p(",")]
    [reshape long power_plant, i(begin end output_mwh redispatch) j(id)]

    My result looks as follows:


    clear
    input str16(begin end) int output_mwh str30 power_plant
    "01.01.2015 00:00" "01.01.2015 01:00" 50 "Heizkraftwerk Heilbronn"
    "01.01.2015 00:00" "01.01.2015 01:00" 50 ""
    "01.01.2015 00:00" "01.01.2015 01:00" 50 ""
    "01.01.2015 00:00" "01.01.2015 01:00" 50 ""
    "01.01.2015 00:00" "01.01.2015 01:00" 82 "Zolling 5"
    "01.01.2015 00:00" "01.01.2015 01:00" 82 ""
    "01.01.2015 00:00" "01.01.2015 01:00" 82 ""
    "01.01.2015 00:00" "01.01.2015 01:00" 82 ""
    "01.01.2015 00:00" "01.01.2015 01:15" 318 "Gebersdorf 2"
    "01.01.2015 00:00" "01.01.2015 01:15" 318 ""
    "01.01.2015 00:00" "01.01.2015 01:15" 318 ""
    "01.01.2015 00:00" "01.01.2015 01:15" 318 ""
    "01.01.2015 00:00" "02.01.2015 00:00" 8300 "Grosskraftwerk Mannheim AG"
    "01.01.2015 00:00" "02.01.2015 00:00" 8300 ""
    "01.01.2015 00:00" "02.01.2015 00:00" 8300 ""
    "01.01.2015 00:00" "02.01.2015 00:00" 8300 ""
    "01.01.2015 00:00" "02.01.2015 00:00" 13901 "Boxberg"
    "01.01.2015 00:00" "02.01.2015 00:00" 13901 " Jänschwalde"
    "01.01.2015 00:00" "02.01.2015 00:00" 13901 " Schkopau"
    "01.01.2015 00:00" "02.01.2015 00:00" 13901 " Schwarze Pumpe"
    "01.01.2015 03:00" "01.01.2015 04:00" 50 "Heizkraftwerk Altbach/Deizisau"
    "01.01.2015 03:00" "01.01.2015 04:00" 50 ""
    "01.01.2015 03:00" "01.01.2015 04:00" 50 ""
    "01.01.2015 03:00" "01.01.2015 04:00" 50 ""
    "01.01.2015 11:00" "01.01.2015 14:00" 150 "Heizkraftwerk Altbach/Deizisau"
    "01.01.2015 11:00" "01.01.2015 14:00" 150 ""
    "01.01.2015 11:00" "01.01.2015 14:00" 150 ""
    "01.01.2015 11:00" "01.01.2015 14:00" 150 ""
    "01.01.2015 11:00" "02.01.2015 00:00" 2966 "Zolling 5"
    "01.01.2015 11:00" "02.01.2015 00:00" 2966 ""
    "01.01.2015 11:00" "02.01.2015 00:00" 2966 ""
    "01.01.2015 11:00" "02.01.2015 00:00" 2966 ""
    "01.01.2015 13:00" "01.01.2015 16:00" 600 "Jänschwalde"
    "01.01.2015 13:00" "01.01.2015 16:00" 600 " Schwarze Pumpe"
    "01.01.2015 13:00" "01.01.2015 16:00" 600 ""
    "01.01.2015 13:00" "01.01.2015 16:00" 600 ""
    "01.01.2015 15:15" "02.01.2015 00:00" 1971 "Gebersdorf 2"
    "01.01.2015 15:15" "02.01.2015 00:00" 1971 ""
    "01.01.2015 15:15" "02.01.2015 00:00" 1971 ""
    "01.01.2015 15:15" "02.01.2015 00:00" 1971 ""
    "01.01.2015 17:00" "01.01.2015 19:00" 100 "Heizkraftwerk Altbach/Deizisau"
    "01.01.2015 17:00" "01.01.2015 19:00" 100 ""
    "01.01.2015 17:00" "01.01.2015 19:00" 100 ""
    "01.01.2015 17:00" "01.01.2015 19:00" 100 ""
    "01.01.2015 18:00" "02.01.2015 00:00" 930 "Moorburg"
    "01.01.2015 18:00" "02.01.2015 00:00" 930 ""
    "01.01.2015 18:00" "02.01.2015 00:00" 930 ""
    "01.01.2015 18:00" "02.01.2015 00:00" 930 ""
    "01.01.2015 20:00" "01.01.2015 21:00" 175 "Gebersdorf 2"
    "01.01.2015 20:00" "01.01.2015 21:00" 175 ""
    "01.01.2015 20:00" "01.01.2015 21:00" 175 ""
    "01.01.2015 20:00" "01.01.2015 21:00" 175 ""
    "01.01.2015 20:00" "02.01.2015 00:00" 670 "Boxberg"
    "01.01.2015 20:00" "02.01.2015 00:00" 670 ""
    "01.01.2015 20:00" "02.01.2015 00:00" 670 ""
    "01.01.2015 20:00" "02.01.2015 00:00" 670 ""
    "01.01.2015 21:00" "02.01.2015 00:00" 150 "Heizkraftwerk Altbach/Deizisau"
    "01.01.2015 21:00" "02.01.2015 00:00" 150 ""
    "01.01.2015 21:00" "02.01.2015 00:00" 150 ""
    "01.01.2015 21:00" "02.01.2015 00:00" 150 ""
    "02.01.2015 00:00" "02.01.2015 03:00" 450 "Grosskraftwerk Mannheim AG"
    "02.01.2015 00:00" "02.01.2015 03:00" 450 ""
    "02.01.2015 00:00" "02.01.2015 03:00" 450 ""
    "02.01.2015 00:00" "02.01.2015 03:00" 450 ""
    "02.01.2015 00:00" "02.01.2015 07:00" 1015 "Isar 2 (SWM)"
    "02.01.2015 00:00" "02.01.2015 07:00" 1015 ""
    "02.01.2015 00:00" "02.01.2015 07:00" 1015 ""
    "02.01.2015 00:00" "02.01.2015 07:00" 1015 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 3782 "Gebersdorf 2"
    "02.01.2015 00:00" "03.01.2015 00:00" 3782 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 3782 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 3782 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 5350 "Heizkraftwerk Altbach/Deizisau"
    "02.01.2015 00:00" "03.01.2015 00:00" 5350 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 5350 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 5350 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 5682 "Zolling 5"
    "02.01.2015 00:00" "03.01.2015 00:00" 5682 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 5682 ""
    "02.01.2015 00:00" "03.01.2015 00:00" 5682 ""
    end

    The power_plant observations that contains more than one observation are now equally divided as it was my goal. However, for the other observations I have now blank spaces which I want to combine again, but I am unsure how to proceed. I would appreciate your help.

    Thank you very much.

    Best regards,
    Bianca

  • #2
    Here is a (more efficient) way that does not involve splitting and reshaping the data.

    Code:
    clear
    input strL(begin end output_mwh power_plant)
    "01.01.2015 00:00" "01.01.2015 01:15" 318 "Gebersdorf 2"
    "01.01.2015 00:00" "01.01.2015 01:00" 82 "Zolling 5"
    "01.01.2015 00:00" "02.01.2015 00:00" 8300 "Grosskraftwerk Mannheim AG "
    "01.01.2015 00:00" "01.01.2015 01:00" 50 "Heizkraftwerk Heilbronn"
    "01.01.2015 00:00" "02.01.2015 00:00" 13901 "Boxberg, Jänschwalde, Schkopau, Schwarze Pumpe"
    "01.01.2015 03:00" "01.01.2015 04:00" 50 "Heizkraftwerk Altbach/Deizisau "
    "01.01.2015 11:00" "02.01.2015 00:00" 2966 "Zolling 5"
    "01.01.2015 11:00" "01.01.2015 14:00" 150 "Heizkraftwerk Altbach/Deizisau "
    "01.01.2015 13:00" "01.01.2015 16:00" 600 "Jänschwalde, Schwarze Pumpe"
    "01.01.2015 15:15" "02.01.2015 00:00" 1971 "Gebersdorf 2"
    "01.01.2015 17:00" "01.01.2015 19:00" 100 "Heizkraftwerk Altbach/Deizisau "
    end
    
    gen long group_id=_n, before(begin)
    gen howmany= length(power_plant)- length(subinstr(power_plant, ",", "", .)) +1
    expand howmany
    bys group: replace power_plant = word(subinstr(subinstr(trim(itrim(power_plant)), " ", "_", .), ",", " ", .) , _n)
    replace power_plant= trim(itrim(subinstr(power_plant, "_", " ", .)))
    Res.:

    Code:
    . l, sepby(group_id)
    
         +------------------------------------------------------------------------------------------------------+
         | group_id              begin                end   output~h                      power_plant   howmany |
         |------------------------------------------------------------------------------------------------------|
      1. |        1   01.01.2015 00:00   01.01.2015 01:15        318                     Gebersdorf 2         1 |
         |------------------------------------------------------------------------------------------------------|
      2. |        2   01.01.2015 00:00   01.01.2015 01:00         82                        Zolling 5         1 |
         |------------------------------------------------------------------------------------------------------|
      3. |        3   01.01.2015 00:00   02.01.2015 00:00       8300       Grosskraftwerk Mannheim AG         1 |
         |------------------------------------------------------------------------------------------------------|
      4. |        4   01.01.2015 00:00   01.01.2015 01:00         50          Heizkraftwerk Heilbronn         1 |
         |------------------------------------------------------------------------------------------------------|
      5. |        5   01.01.2015 00:00   02.01.2015 00:00      13901                          Boxberg         4 |
      6. |        5   01.01.2015 00:00   02.01.2015 00:00      13901                      Jänschwalde         4 |
      7. |        5   01.01.2015 00:00   02.01.2015 00:00      13901                         Schkopau         4 |
      8. |        5   01.01.2015 00:00   02.01.2015 00:00      13901                   Schwarze Pumpe         4 |
         |------------------------------------------------------------------------------------------------------|
      9. |        6   01.01.2015 03:00   01.01.2015 04:00         50   Heizkraftwerk Altbach/Deizisau         1 |
         |------------------------------------------------------------------------------------------------------|
     10. |        7   01.01.2015 11:00   02.01.2015 00:00       2966                        Zolling 5         1 |
         |------------------------------------------------------------------------------------------------------|
     11. |        8   01.01.2015 11:00   01.01.2015 14:00        150   Heizkraftwerk Altbach/Deizisau         1 |
         |------------------------------------------------------------------------------------------------------|
     12. |        9   01.01.2015 13:00   01.01.2015 16:00        600                      Jänschwalde         2 |
     13. |        9   01.01.2015 13:00   01.01.2015 16:00        600                   Schwarze Pumpe         2 |
         |------------------------------------------------------------------------------------------------------|
     14. |       10   01.01.2015 15:15   02.01.2015 00:00       1971                     Gebersdorf 2         1 |
         |------------------------------------------------------------------------------------------------------|
     15. |       11   01.01.2015 17:00   01.01.2015 19:00        100   Heizkraftwerk Altbach/Deizisau         1 |
         +------------------------------------------------------------------------------------------------------+
    Otherwise, it appears that you simply want:

    Code:
    drop if missing(power_plant)
    in #1. But the lack of a group identifier in your code may lead to difficulty identifying groups if two or more groups have the same observations on the 4 variables "begin", "end", "output" and "power_plant".
    Last edited by Andrew Musau; 17 Jan 2024, 09:57.

    Comment


    • #3
      Dear Andrew,

      yes that worked. Your code just did not divide the output_mwh variable equally to the number of power_plants. However, this problem I could easily solve with the command [generate powerr_plant_new=output_mwh/howmany]
      So now everything looks fine. Thank you so much

      Comment

      Working...
      X