Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping from wide to long and keeping the label

    Hello everyone,

    I want to reshape my dataset from wide to long format and keep the label information as a new variable.

    This is what my data looks like in long format:
    Code:
    CompanyName                                      _v1                    _v2                   _v3                  _v4
    "GSW Immobilien AG"                                          0       5894403840 -1.88679245283019 5910893287.23354
    "Deutsche Rohstoff AG"                        4.28571428571428       74193506.2  9.80392156862745         71144458
    "Draegerwerk AG & Co KGaA"                   -.134408602150549 1392782583.91451 -.626666666666659 1407410211.36748
    "Tonkens Agrar AG"                            .442477876106205 7521354.29451648                 0 7519657.36964375
    "BHB Brauholding Bayern Mitte AG"            -13.9240506329114          8432000  16.1764705882353          9796000
    "Deutsche Konsum REIT AG"                    -.324675324675329 538890934.198159 -.323624595469249 542916002.194247
    "BRAIN Biotech AG"                            2.40700218818379      185902329.6 -1.72043010752688      181532830.4
    "Senvion SA"                                  10.4972375690608       14623376.6  .555555555555556     13234155.823
    "Lion E Mobility AG"                         -6.63265306122449      36719436.78 -3.20987654320987      39327921.36
    "va Q tec AG"                                -6.35179153094462      376323182.5 -.486223662884934      401847711.4
    "Bitcoin Group SE"                           -8.73180873180874        2.195e+08 -1.83673469387755        2.405e+08
    "Shop Apotheke Europe NV"                    -.846354166666656     2731518928.3 -4.77371357718538     2754834585.6
    "Medios AG"                                  -2.77777777777778        709274685                 0        729539676
    end
    _v1 is return of day one (the label of the variable is the date, here it is 5/11/2021), _v2 is the market cap of the same day (again, the label is 5/11/2021).

    I reshaped the data using the following command:

    Code:
    gen long obs_no = _n
    reshape long _v, i(obs_no) j(_j)
    gen which_var = "return" if mod(_j, 2) == 1, before(_v)
    replace which_var = "market_cap" if missing(which_var)
    gen firm_num = ceil(_j/2)
    drop  _j
    reshape wide _v, i(firm_num obs_no) j(which_var) string
    rename _v* *
    drop obs_no
    unfortunately, I was not able yet to create a new variable with the date.
    This is what my data looks like after reshaping it:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    firm_num  market_cap      return        CompanyName
    1       5894403840                 0 "GSW Immobilien AG"                         
    1       74193506.2  4.28571428571428 "Deutsche Rohstoff AG"                      
    1 1392782583.91451 -.134408602150549 "Draegerwerk AG & Co KGaA"                  
    1 7521354.29451648  .442477876106205 "Tonkens Agrar AG"                          
    1          8432000 -13.9240506329114 "BHB Brauholding Bayern Mitte AG"           
    1 3931445305.56908 -2.66106442577032 "Stroeer SE & Co KGaA"                                                        
    1 11003685830.4026 -1.27868852459017 "Uniper SE"                                 
    1      36719436.78 -6.63265306122449 "Lion E Mobility AG"                        
    1      376323182.5 -6.35179153094462 "va Q tec AG"                               
    1        2.195e+08 -8.73180873180874 "Bitcoin Group SE"                          
    1     2731518928.3 -.846354166666656 "Shop Apotheke Europe NV"                   
    1        709274685 -2.77777777777778 "Medios AG"                                 
    end
    What I would like is a new variable with the label information. Hence, for firm_num=1 it would be 5/11/2021, so the same date as for variable _v1 and _v2.


  • #2
    Here's one way to do it. Note that I don't know the total number of your variables, presumably more than 4, and I've guessed German conventions so daily dates are DMY not MDY.


    Code:
    clear
    input str42 CompanyName                                      _v1                    _v2                   _v3                  _v4
    "GSW Immobilien AG"                                          0       5894403840 -1.88679245283019 5910893287.23354
    "Deutsche Rohstoff AG"                        4.28571428571428       74193506.2  9.80392156862745         71144458
    "Draegerwerk AG & Co KGaA"                   -.134408602150549 1392782583.91451 -.626666666666659 1407410211.36748
    "Tonkens Agrar AG"                            .442477876106205 7521354.29451648                 0 7519657.36964375
    "BHB Brauholding Bayern Mitte AG"            -13.9240506329114          8432000  16.1764705882353          9796000
    "Deutsche Konsum REIT AG"                    -.324675324675329 538890934.198159 -.323624595469249 542916002.194247
    "BRAIN Biotech AG"                            2.40700218818379      185902329.6 -1.72043010752688      181532830.4
    "Senvion SA"                                  10.4972375690608       14623376.6  .555555555555556     13234155.823
    "Lion E Mobility AG"                         -6.63265306122449      36719436.78 -3.20987654320987      39327921.36
    "va Q tec AG"                                -6.35179153094462      376323182.5 -.486223662884934      401847711.4
    "Bitcoin Group SE"                           -8.73180873180874        2.195e+08 -1.83673469387755        2.405e+08
    "Shop Apotheke Europe NV"                    -.846354166666656     2731518928.3 -4.77371357718538     2754834585.6
    "Medios AG"                                  -2.77777777777778        709274685                 0        729539676
    end
    
    label var _v1 "5/11/2019"
    label var _v2 "5/11/2019"
    label var _v3 "6/11/2019"
    label var _v4 "6/11/2019"
    
    local J = 1
    * replace 4 by highest suffix, so 42 if _v1 ... _v42
    forval j = 1/4 {
            if mod(`j', 2) {
                    local retlist `retlist' _v`j'
                    local label`J'  "`: var label _v`j''"
                    local ++J
            }
            else local caplist `caplist' _v`j'
    }
    
    rename (`retlist') (return#), addnumber
    rename (`caplist') (capital#), addnumber
    
    reshape long return capital, i(Company) j(Date)
    
    su Date, meanonly
    
    * replace DMY if dates are really MDY
    forval J = 1/`r(max)' {
        replace Date = daily("`label`J''", "DMY") if Date == `J'
    }
    
    format Date %td
    list, sepby(CompanyName)
    
         +--------------------------------------------------------------------+
         |                     CompanyName        Date      return    capital |
         |--------------------------------------------------------------------|
      1. | BHB Brauholding Bayern Mitte AG   05nov2019   -13.92405    8432000 |
      2. | BHB Brauholding Bayern Mitte AG   06nov2019    16.17647    9796000 |
         |--------------------------------------------------------------------|
      3. |                BRAIN Biotech AG   05nov2019    2.407002   1.86e+08 |
      4. |                BRAIN Biotech AG   06nov2019    -1.72043   1.82e+08 |
         |--------------------------------------------------------------------|
      5. |                Bitcoin Group SE   05nov2019   -8.731809   2.20e+08 |
      6. |                Bitcoin Group SE   06nov2019   -1.836735   2.41e+08 |
         |--------------------------------------------------------------------|
      7. |         Deutsche Konsum REIT AG   05nov2019   -.3246753   5.39e+08 |
      8. |         Deutsche Konsum REIT AG   06nov2019   -.3236246   5.43e+08 |
         |--------------------------------------------------------------------|
      9. |            Deutsche Rohstoff AG   05nov2019    4.285714   7.42e+07 |
     10. |            Deutsche Rohstoff AG   06nov2019    9.803922   7.11e+07 |
         |--------------------------------------------------------------------|
     11. |        Draegerwerk AG & Co KGaA   05nov2019   -.1344086   1.39e+09 |
     12. |        Draegerwerk AG & Co KGaA   06nov2019   -.6266667   1.41e+09 |
         |--------------------------------------------------------------------|
     13. |               GSW Immobilien AG   05nov2019           0   5.89e+09 |
     14. |               GSW Immobilien AG   06nov2019   -1.886792   5.91e+09 |
         |--------------------------------------------------------------------|
     15. |              Lion E Mobility AG   05nov2019   -6.632653   3.67e+07 |
     16. |              Lion E Mobility AG   06nov2019   -3.209877   3.93e+07 |
         |--------------------------------------------------------------------|
     17. |                       Medios AG   05nov2019   -2.777778   7.09e+08 |
     18. |                       Medios AG   06nov2019           0   7.30e+08 |
         |--------------------------------------------------------------------|
     19. |                      Senvion SA   05nov2019    10.49724   1.46e+07 |
     20. |                      Senvion SA   06nov2019    .5555556   1.32e+07 |
         |--------------------------------------------------------------------|
     21. |         Shop Apotheke Europe NV   05nov2019   -.8463542   2.73e+09 |
     22. |         Shop Apotheke Europe NV   06nov2019   -4.773714   2.75e+09 |
         |--------------------------------------------------------------------|
     23. |                Tonkens Agrar AG   05nov2019    .4424779    7521355 |
     24. |                Tonkens Agrar AG   06nov2019           0    7519658 |
         |--------------------------------------------------------------------|
     25. |                     va Q tec AG   05nov2019   -6.351791   3.76e+08 |
     26. |                     va Q tec AG   06nov2019   -.4862237   4.02e+08 |
         +--------------------------------------------------------------------+
    .

    Comment


    • #3
      Hello Nick,

      thank you for your answer.

      I was able to rename the variables, but the reshape command didn't work. I got the following error message:

      Code:
      variable id does not uniquely identify the observations
          Your data are currently wide.  You are performing a reshape long.  You specified i(CompanyName) and
          j(Date).  In the current wide form, variable CompanyName should uniquely identify the observations.
          Remember this picture:
      
               long                                wide
              +---------------+                   +------------------+
              | i   j   a   b |                   | i   a1 a2  b1 b2 |
              |---------------| <--- reshape ---> |------------------|
              | 1   1   1   2 |                   | 1   1   3   2  4 |
              | 1   2   3   4 |                   | 2   5   7   6  8 |
              | 2   1   5   6 |                   +------------------+
              | 2   2   7   8 |
              +---------------+
          Type reshape error for a list of the problem observations.

      Do you have an idea how I can solve that problem?
      I have the variables _v1 - _v1200 (or after renaming them return1-return600 and capital1-capital600). The date is MDY, even though it is a German sample. But I don't mind it.

      Comment


      • #4
        Your questions may be related. Your dates should certainly be calculated as presented as MDY if they are. Otherwise you can see that reshape works for your data example. If it doesn't work for your real dataset, you should check for duplicates first.

        Comment


        • #5
          I checked for duplicates. There weren't duplicates when I used duplicates list or duplicates drop, but I sorted by CompanyName and saw that there are duplicates of some Companies which are from a dataset I appended before.
          Some background information: I have one bigger dataset with German firms and one smaller one (only 16 observations) with firms from the Netherland and Luxemburg.
          In the Netherland/Luxemburg dataset which I used for appending has no duplicates before I use the append command and I am not able to remove them using duplicates drop later after the appending when they appear. I have never experienced this kind of problem before. I think this is why the Stata cant uniquely identify the variable CompanyName for reshaping.

          Comment


          • #6
            I figured it out and the code worked! Thank you so much

            Comment

            Working...
            X