Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape wide to long: many variables of many years

    Hi All,

    I have a dataset with more than 10 variables of more than 10 years in wide shape. I'm trying to reshape the dataset from wide to long, with the new variable (j) being Year, and other columns being the variable names, which are Apple, Orange, Cherry, ..., Dog, Cat, Penguin, ...I tried it with the following code:
    Code:
    reshape long Y, i(CountryCode) j(Year) string
    , which does not work. It changes the dataset to completely long-shaped, with year-variable pairs in one column instead of putting the different variable names as different columns.

    I'm wondering how can I reshape only the "Year" part of this dataset from wide to long, and let the variable names remain in different columns?
    The columns names that I wish should look like
    Code:
    CountryCode Country Year Apple Orange Cherry ... Dog Cat Penguin ...
    Below shows the data I use and the results I got:

    The data I use:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3 CountryCode str22 Country byte(Y2000Apple Y2001Apple Y2002Apple Y2003Apple Y2004Apple Y2005Apple Y2006Apple Y2007Apple Y2000Dog Y2001Dog Y2002Dog Y2003Dog Y2004Dog Y2005Dog Y2006Dog Y2007Dog)
    "AFG" "Afghanistan"            55 24 47 32 54 21 32 33 52 11 54  2 28 47 26 48
    "ALB" "Albania"                48 14 60 40 33 15 34 10  8 38 12 46 42 55 24 44
    "DZA" "Algeria"                23 63 59 65 62 19 39 30 36 36 16 60 53 36 17 39
    "AGO" "Angola"                 12 59 34 23 59 53 38 10 30 57 59 48 41 50 57 38
    "ARG" "Argentina"              65 54 64 36 39 26 17 18 51 36 10 40 12 37 49 21
    "ARM" "Armenia"                17 34 54 44 60  1  1 59 66 60 17 16 34  4 41 45
    "AUS" "Australia"              44 16  9 32 38 34 16 39  5 44 27 60 19 44  9 25
    "AUT" "Austria"                57 11 27 28 23 17  3 33 28 59  1 28  7  5 30 57
    "AZE" "Azerbaijan"             66 36  5 29 34  1 24 57 11 16 53 44 32 15 17 12
    "BGD" "Bangladesh"             21 34 47 47 60  5 60 62 19 20 48  3 64  7 35  2
    "BLR" "Belarus"                30 61 25 51 52 15 57 10 28 51  4 26 42 18 38  2
    "BEL" "Belgium"                 9 43 27 53  6 66 59 32 15  4 42 42 55 59 12 65
    "BOL" "Bolivia"                 5 28 62 54 25 41 32 25 24 35  3  4 14 54 26 11
    "BIH" "Bosnia and Herzegovina"  9 39  6 48 60 64 31 55  3 17 10 25 51 44 26 25
    "BWA" "Botswana"               60 26 65 50 38 47  5 31 24 55 50 31 63 45 42 43
    "BRA" "Brazil"                 25  3 32 59 46 55 51 13  5 14  8 30  1 55 47 60
    "BGR" "Bulgaria"               42 10 24 17 37 34 26 30 25 32 39 46  8 31 29 60
    "BFA" "Burkina Faso"           41  6 18  2 29  6 63 53  7 20 57 59 66 50 36  2
    "KHM" "Cambodia"               33 39 51 58 65 54  5 40 22 24 15 51 34 13 66 35
    "CMR" "Cameroon"               18 19 12 60 14  9 61  2 53 20 38 34  6 17 31 43
    "CAN" "Canada"                 13 55 44 49  2 52 65  5 43 30 31  2 66 48 42 49
    "CHL" "Chile"                  14 12 48 34 61 44 17 49 28 57 36 47  6 62 61 63
    "CHN" "China"                  59 32 47  6 60 65 31 62 59 40 35 42 16 36 13 58
    "HKG" "China, Hong Kong"       24 22  2 15 37 13 28 64 48 22 58  6 32 55 10  8
    "TWN" "Chinese Taipei"         41 54 31 10 66 36  1 27 48 25 19 60  8 54  5 47
    "COL" "Colombia"                6 60 22 56 60 41 19  2 36 59 16 33 26 64 22 15
    "COD" "Congo, Dem. Rep."       45 18 51 63 24 10 26 27 24  8 57 31 16 55  5 43
    "COG" "Congo, Rep."            30 39 46 63  4 32 19 51 24 20 43  6 50 44 22 11
    "CRI" "Costa Rica"              2  3 65 15 13  5  9 61  8 46 60 47  6 27 39 17
    "HRV" "Croatia"                52 60 30 42 19 57 11 15 32 49 44 66 35 41 46 47
    "CUB" "Cuba"                    3 12 55 20 13  1 23 64 38 24 54  9  4 43 35 64
    "CYP" "Cyprus"                 34 12 15 38 11 31 46  3 37 63 66 30 63 46 13 24
    "CZE" "Czech Republic"         62 16 48 26 62 59 50 11 10 48 33 51 28 30  2 55
    "DNK" "Denmark"                60 20 43 63 16 40 58 40 32 47 54 26 25 27 66  4
    "ECU" "Ecuador"                27 58 26 59 20 11 50 62 17 57 49 48 19 50  7 54
    "EGY" "Egypt, Arab Rep."       17 40 54 22  3  8 39 42 66  5 63 43 60 36 44 28
    "EST" "Estonia"                40 15  4  8  8 52 42 61 38 58 36 24 14 14 24 16
    "SWZ" "Eswatini"               44 46 24 31 58 22 20 50  5 37 62 57 36 40  1  1
    "ETH" "Ethiopia"               64 60 38 49 43 66  6 17 44 51 20  1 25 43 26 62
    "FIN" "Finland"                34 18  2 31 11 50 55 19 58 58 54 23 23 39 27 34
    "FRA" "France"                 66 65  6 64 31 33  4 46 12 66 20 41 41 42  9 11
    "GAB" "Gabon"                  44 64  3 28 32  3 57 54  4 33 35 19 27 49  9 13
    "GMB" "Gambia, The"            60 23 41 38 33 61 20 46 13 22 52 26 66 57 61 54
    "GEO" "Georgia"                58 53 50  9 42 61 10 32 34  1 16 31  7 43 34 65
    "DEU" "Germany"                66  3 28 39 41 19 33 14 11 52  2 52 16 21 23 63
    "GHA" "Ghana"                  19 62 13  3 36 59 13 19 61 21 41 59  1 58  1 11
    "GRC" "Greece"                  3 26  5 25 29 49 29 28 43 10 40 22 28 38 48 44
    "HUN" "Hungary"                53  8  8 33 19 62 59 18 49 10 57  7 14  9 35 64
    "ISL" "Iceland"                38 38 38 59 29 26 60 42 58 17 55 20 25  8 50 63
    "IND" "India"                  56 49 38 50 39 51 25 36  6 33 27 38  1 49 47 29
    "IDN" "Indonesia"              61 43 49 43 66 39 60 57 37 14 59 11 41 66 24  4
    "IRN" "Iran, Islamic Rep."     39 32 14 47 62 60 35  2 34 27 12 36 42 65 50 37
    "IRQ" "Iraq"                   41 28 65 23 29 44 44 61 30 17  2 27 41 14 39 25
    "IRL" "Ireland"                40 10 41 23 58  2  2  2 53 45 27 57 52 56 38  8
    "ISR" "Israel"                 38 26 65 23  5 63 25 60 48 58 27 15  3 21  2 54
    "ITA" "Italy"                  13  4 21 27 59 22 27 56 19 23 21 12  7 52 52  4
    "JPN" "Japan"                  29  6 27 56 10  9 65 49  3 25 40 60 51  8 32 46
    "JOR" "Jordan"                 10 39 53 18 52 13  9 36 10 27 43 50 14 64 28 57
    "KAZ" "Kazakhstan"             26 33 30 37 21 41  7 16 43 28 15 23  4 20 59 32
    "KEN" "Kenya"                  26  4 29 28 28 48  1 44 38 26 27 24  1 26 43 42
    "KOR" "Korea, Rep."             2 16 63 21 28  2  7 43 51 21 13 35 21 16  9 31
    "XKX" "Kosovo"                 41 60 47 26  6 29  5  7 32 50  9 26 29 52 21 62
    "KGZ" "Kyrgyz Republic"        21 22 66  8  4 53 24  7 15 57 46 55 34 36 27 24
    "LAO" "Lao PDR"                35 34 54 27 43 36 59 52 47 24 18 31 49 34 54  3
    "LVA" "Latvia"                 45 11 52 26 50 47 44  3 43 39 42 34 19 39 17 23
    "LIE" "Liechtenstein"           6 20 35 52 49 63 54  7 46 54  1 31 25 19 34  8
    "LTU" "Lithuania"              57 63 13 61  2 12 39 61 64 37 29 49 25 52 48 54
    "LUX" "Luxembourg"             64 33 16 40 54 64 55  4  2 45  6  9 17 46  6 59
    "MDG" "Madagascar"              7 30 36 27 56 62 58 26 25  4 52 63 53 39 21  8
    "MWI" "Malawi"                 32 49 65 63 49 58 12 16 48 23 21 63 62 29 65 20
    "MYS" "Malaysia"               10 54  7 17 30  1 21 28 46 20 30  5 48 60 55 59
    "MLI" "Mali"                   19 63 15 55 23 62 22 54 43  2 30 31 35 43 49  2
    "MLT" "Malta"                  14 58 31 37 54 63 17 65 59 36 62 34 24 57 50 61
    "MRT" "Mauritania"             15  9 43 19 21 63 56 53 61  6 66 58 20 43  9 47
    "MEX" "Mexico"                 30 40 49 55 44 59 29 54 61 15 56 65 51 46  5 50
    "MDA" "Moldova"                 5  9 37  3 34 14 36 57  2 24 50 41 22 36 63 26
    "MCO" "Monaco"                 53  7 53 15 11 44 19 15 18 40  5 44  7 32  3 48
    "MNG" "Mongolia"                5 28 18 28 50 11 14 55 17 62 42 13 35 66 11 64
    "MNE" "Montenegro"              3 15  4 32 14 45 51 32 55  2 66 26 44 24 48 19
    "MAR" "Morocco"                12 40 47 37 13 41 30 15 65  2 63  6  1 48  3  9
    "MOZ" "Mozambique"              3 46 36 57 50  4 13 23 26 31 24 43 41 49 52 54
    "MMR" "Myanmar"                19 27 28 57 23 54 57  5 55 25 23 58  8 14 35 15
    "NAM" "Namibia"                17  8 60  8 36 63 33 16 26 62 59  3 60 53 55 50
    "NLD" "Netherlands"            47 29 56 19 59 48  4 38 34 37 33  6 13  5 56 27
    "NZL" "New Zealand"            42 57 43 34 17 34 65  8 27 41 46 24  8 47 54  2
    "NIC" "Nicaragua"              42 31 45 62 26 40 24 21 29 43 47  7 57 59 43 62
    "NGA" "Nigeria"                66 41 60 24 17 64  5 10 49 29 12 64 13 47  3  2
    "MKD" "North Macedonia"        16 43 44 63  2 45 12 14 41 28 24 62 46 38 58 27
    "NOR" "Norway"                 56 35 19 13  4 43  3 42 30 65  8 12  7 48 16 29
    "PAK" "Pakistan"               60 57 44 58 23 45 65 39 46 50 63 64  3 57 26 61
    "PRY" "Paraguay"               59 20 65 54 31  5 31  3 17 58 26 49 27 21 57 53
    "PER" "Peru"                   54 12 55 57 29  8 18  6 48 38 31 20 51 46 20 52
    "POL" "Poland"                 64 45 17 23 20 52 36  3 38 51 53 20 43 28  8 25
    "PRT" "Portugal"               45 23 45 59 32 31  9 16 32 55 52  1 19 35 29 12
    "PRI" "Puerto Rico"             6 35 28 62  5 29 21 17 59 45 20 46 42 66 12 55
    "ROU" "Romania"                25 26 47 55 12 66 31 15  8  4 35 62 53  4 11 31
    "RUS" "Russian Federation"     48 17 10 46 53 65 29 32 52 35 46  3 25 16 37 56
    "SAU" "Saudi Arabia"           11 45  9  8 41 17 54 60 60 54 23 33 16 14 35 34
    "SEN" "Senegal"                 8 62 63  9 19  2 58 23 44 52 43 45 13 30 24 40
    "SRB" "Serbia"                 57 37  1 61 46 50 48 52 45 20 29 55 20 62 33 40
    end
    The results I got:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3 CountryCode str9 Year str22 Country byte Y
    "AFG" "2000Apple" "Afghanistan" 55
    "AFG" "2000Dog"   "Afghanistan" 52
    "AFG" "2001Apple" "Afghanistan" 24
    "AFG" "2001Dog"   "Afghanistan" 11
    "AFG" "2002Apple" "Afghanistan" 47
    "AFG" "2002Dog"   "Afghanistan" 54
    "AFG" "2003Apple" "Afghanistan" 32
    "AFG" "2003Dog"   "Afghanistan"  2
    "AFG" "2004Apple" "Afghanistan" 54
    "AFG" "2004Dog"   "Afghanistan" 28
    "AFG" "2005Apple" "Afghanistan" 21
    "AFG" "2005Dog"   "Afghanistan" 47
    "AFG" "2006Apple" "Afghanistan" 32
    "AFG" "2006Dog"   "Afghanistan" 26
    "AFG" "2007Apple" "Afghanistan" 33
    "AFG" "2007Dog"   "Afghanistan" 48
    "AGO" "2000Apple" "Angola"      12
    "AGO" "2000Dog"   "Angola"      30
    "AGO" "2001Apple" "Angola"      59
    "AGO" "2001Dog"   "Angola"      57
    "AGO" "2002Apple" "Angola"      34
    "AGO" "2002Dog"   "Angola"      59
    "AGO" "2003Apple" "Angola"      23
    "AGO" "2003Dog"   "Angola"      48
    "AGO" "2004Apple" "Angola"      59
    "AGO" "2004Dog"   "Angola"      41
    "AGO" "2005Apple" "Angola"      53
    "AGO" "2005Dog"   "Angola"      50
    "AGO" "2006Apple" "Angola"      38
    "AGO" "2006Dog"   "Angola"      57
    "AGO" "2007Apple" "Angola"      10
    "AGO" "2007Dog"   "Angola"      38
    "ALB" "2000Apple" "Albania"     48
    "ALB" "2000Dog"   "Albania"      8
    "ALB" "2001Apple" "Albania"     14
    "ALB" "2001Dog"   "Albania"     38
    "ALB" "2002Apple" "Albania"     60
    "ALB" "2002Dog"   "Albania"     12
    "ALB" "2003Apple" "Albania"     40
    "ALB" "2003Dog"   "Albania"     46
    "ALB" "2004Apple" "Albania"     33
    "ALB" "2004Dog"   "Albania"     42
    "ALB" "2005Apple" "Albania"     15
    "ALB" "2005Dog"   "Albania"     55
    "ALB" "2006Apple" "Albania"     34
    "ALB" "2006Dog"   "Albania"     24
    "ALB" "2007Apple" "Albania"     10
    "ALB" "2007Dog"   "Albania"     44
    "ARG" "2000Apple" "Argentina"   65
    "ARG" "2000Dog"   "Argentina"   51
    "ARG" "2001Apple" "Argentina"   54
    "ARG" "2001Dog"   "Argentina"   36
    "ARG" "2002Apple" "Argentina"   64
    "ARG" "2002Dog"   "Argentina"   10
    "ARG" "2003Apple" "Argentina"   36
    "ARG" "2003Dog"   "Argentina"   40
    "ARG" "2004Apple" "Argentina"   39
    "ARG" "2004Dog"   "Argentina"   12
    "ARG" "2005Apple" "Argentina"   26
    "ARG" "2005Dog"   "Argentina"   37
    "ARG" "2006Apple" "Argentina"   17
    "ARG" "2006Dog"   "Argentina"   49
    "ARG" "2007Apple" "Argentina"   18
    "ARG" "2007Dog"   "Argentina"   21
    "ARM" "2000Apple" "Armenia"     17
    "ARM" "2000Dog"   "Armenia"     66
    "ARM" "2001Apple" "Armenia"     34
    "ARM" "2001Dog"   "Armenia"     60
    "ARM" "2002Apple" "Armenia"     54
    "ARM" "2002Dog"   "Armenia"     17
    "ARM" "2003Apple" "Armenia"     44
    "ARM" "2003Dog"   "Armenia"     16
    "ARM" "2004Apple" "Armenia"     60
    "ARM" "2004Dog"   "Armenia"     34
    "ARM" "2005Apple" "Armenia"      1
    "ARM" "2005Dog"   "Armenia"      4
    "ARM" "2006Apple" "Armenia"      1
    "ARM" "2006Dog"   "Armenia"     41
    "ARM" "2007Apple" "Armenia"     59
    "ARM" "2007Dog"   "Armenia"     45
    "AUS" "2000Apple" "Australia"   44
    "AUS" "2000Dog"   "Australia"    5
    "AUS" "2001Apple" "Australia"   16
    "AUS" "2001Dog"   "Australia"   44
    "AUS" "2002Apple" "Australia"    9
    "AUS" "2002Dog"   "Australia"   27
    "AUS" "2003Apple" "Australia"   32
    "AUS" "2003Dog"   "Australia"   60
    "AUS" "2004Apple" "Australia"   38
    "AUS" "2004Dog"   "Australia"   19
    "AUS" "2005Apple" "Australia"   34
    "AUS" "2005Dog"   "Australia"   44
    "AUS" "2006Apple" "Australia"   16
    "AUS" "2006Dog"   "Australia"    9
    "AUS" "2007Apple" "Australia"   39
    "AUS" "2007Dog"   "Australia"   25
    "AUT" "2000Apple" "Austria"     57
    "AUT" "2000Dog"   "Austria"     28
    "AUT" "2001Apple" "Austria"     11
    "AUT" "2001Dog"   "Austria"     59
    end
    Many thanks!
    Craig
    Last edited by Craig Yang; 26 Aug 2020, 01:17.

  • #2
    Try:

    Code:
    . ren (Y*Apple Y*Dog) (Apple* Dog*)
    
    . reshape long Apple Dog, i( CountryCode ) j(Year)
    (note: j = 2000 2001 2002 2003 2004 2005 2006 2007)
    
    Data                               wide   ->   long
    -----------------------------------------------------------------------------
    Number of obs.                      100   ->     800
    Number of variables                  18   ->       5
    j variable (8 values)                     ->   Year
    xij variables:
          Apple2000 Apple2001 ... Apple2007   ->   Apple
                Dog2000 Dog2001 ... Dog2007   ->   Dog
    -----------------------------------------------------------------------------
    
    . list in 1/21, sep(0)
    
         +---------------------------------------------+
         | Countr~e   Year       Country   Apple   Dog |
         |---------------------------------------------|
      1. |      AFG   2000   Afghanistan      55    52 |
      2. |      AFG   2001   Afghanistan      24    11 |
      3. |      AFG   2002   Afghanistan      47    54 |
      4. |      AFG   2003   Afghanistan      32     2 |
      5. |      AFG   2004   Afghanistan      54    28 |
      6. |      AFG   2005   Afghanistan      21    47 |
      7. |      AFG   2006   Afghanistan      32    26 |
      8. |      AFG   2007   Afghanistan      33    48 |
      9. |      AGO   2000        Angola      12    30 |
     10. |      AGO   2001        Angola      59    57 |
     11. |      AGO   2002        Angola      34    59 |
     12. |      AGO   2003        Angola      23    48 |
     13. |      AGO   2004        Angola      59    41 |
     14. |      AGO   2005        Angola      53    50 |
     15. |      AGO   2006        Angola      38    57 |
     16. |      AGO   2007        Angola      10    38 |
     17. |      ALB   2000       Albania      48     8 |
     18. |      ALB   2001       Albania      14    38 |
     19. |      ALB   2002       Albania      60    12 |
     20. |      ALB   2003       Albania      40    46 |
     21. |      ALB   2004       Albania      33    42 |
         +---------------------------------------------+

    Comment


    • #3
      Hi Joro, thank you very much! The code works smoothly.

      Since I have 12 variables and 18 years in total, I'm wondering if there's a way that I do not need to type out every variable and every year?

      Comment


      • #4
        https://www.stata.com/support/faqs/d...-with-reshape/ covers this at

        If I have many variables all occurring in pairs for two years 1997 and 1998, so that the dataset looks like A97, A98, B97, B98, and so on, is there any easy way to reshape the data to long without typing all the stub names?

        Code:
        unab stubs : Y2000* 
        local stubs : subinstr local stubs "Y2000" "", all

        Comment


        • #5
          Thank you, Nick! This is super helpful!

          In my case, I'll need to rename the variables before reshape.

          The following code will do the work:
          Code:
          unab stubs : Y2000*
          local stubs : subinstr local stubs "Y2000" "", all
          
          foreach stub of local stubs{
              rename Y*`stub' `stub'*
              }
              
          reshape long `stubs', i(CountryCode) j(Year)

          Comment


          • #6
            Hi all,
            I have this dataset. I have almost same problem . when i use reshape long yr, i(C_ISIN var) j(Year) string, it changes the dataset to completely long-shaped, with year-variable pairs in one column. But i want to create 12 columns based on listed values( MV, MVC,WC0125, etc) in the variable 'var'. can i create columns based on 8 distinct values in var variable. I am really struggling and if you can please help me. I need this solution for my Masters thesis. Thanks






            Screenshot 2023-02-20 024109.jpg

            Comment


            • #7
              Yes, it can probably be done. You have to first -reshape long -in the way you described doing, and then -reshape wide- with some different settings to get the values of var to become actual variable names. It's going to look something like this:
              Code:
               reshape long yr, i(C_ISIN var) j(Year) string
              
               reshape wide yr, i(C_ISIN Year) j(var) string
               rename yr* *
              But if this isn't exactly right, it requires a bit of tinkering with and testing the code to get it right the first time, which is rather difficult to do from a screenshot. Please post back showing example data by using the -dataex- command. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

              Also, in order for what you ask to be possible, the values in the variable var must, themselves, be legal Stata variable names. That is, they must not contain any spaces or special characters (other than _), and they must begin with a letter or _. Also, they must be at most 32 characters long. (And, the code above would actually only work with a maximum of 30 characters.) If these conditions are not met, then you would first have to change the values to things that conform to the rules of Stata variable names but resemble the original values as closely as possible.
              Last edited by Clyde Schechter; 19 Feb 2023, 20:17.

              Comment

              Working...
              X