Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dataset in Matrix format - Reshape with different column names

    Hi All,
    I am working on a dataset that is originally in the matrix format downloaded as an excel spreadsheet from the official website. However, I want to reshape the dataset (as per my understanding) in a long format, where columns and rows are rearranged in just 3 columns namely i (suppliers), j (users), and value. I looked at dataex to provide an example of the data that I have, but I couldn't figure out how to create an example. Therefore, I am providing an example below:
    Agriculture Manufacturing
    Codes i/j 111 1234
    111 Agriculture 0.40 0.30
    1234 Manufacturing 0.30 0.50
    5110 Services 0.20 0.10
    9100 Government 0.10 0.10
    I want to transform the table as follows:
    i (suppliers) Codes i j (users) Codes j Value
    Agriculture 111 Agriculture 111 0.40
    Manufacturing 1234 Agriculture 111 0.30
    Services 5110 Agriculture 111 0.20
    Government 9100 Agriculture 111 0.10
    Agriculture 111 Manufacturing 1234 0.30
    Manufacturing 1234 Manufacturing 1234 0.50
    Services 5110 Manufacturing 1234 0.10
    Government 9100 Manufacturing 1234 0.10
    I am trying to get a solution to this issue for a long time. I will be grateful if someone can guide me through this process. I have read on matrix and reshape command. However, reshape only works when column names have a common prefix. Any suggestions on this issue is greatly appreciated. Thanks!

    Preety

  • #2
    I looked at dataex to provide an example of the data that I have, but I couldn't figure out how to create an example.
    After importing the Excel file, copy and paste the result of

    Code:
    dataex in 1/20

    Comment


    • #3
      Hi Andrew, I tried the suggested command but the error message says "input statement exceeds linesize limit. Try specifying fewer variables". Thanks. In the columns - I have about 402 column names. I think that may be the issue when I try to produce an example using dataex.

      Comment


      • #4
        Select a few. Try:

        Code:
        qui ds
        dataex `=word("`r(varlist)'", 1)' - `=word("`r(varlist)'", 5)' in 1/20
        If that fails, change 5 to 4, 4 to 3 and so on in the above, until you are successful.

        Comment


        • #5
          This is what I got -
          . dataex `=word("`r(varlist)'", 1)' - `=word("`r(varlist)'", 5)' in 1/20

          ----------------------- copy starting from the next line -----------------------
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str20 A str96 B str17(C D) str27 E
          ""                     ""                                                                                           ""                ""              ""                           
          "Commodity / Industry" ""                                                                                           "Oilseed farming" "Grain farming" "Vegetable and melon farming"
          "Code"                 "Commodity Description"                                                                      "1111A0"          "1111B0"        "111200"                     
          "1111A0"               "Oilseed farming"                                                                            ".2180219"        ".0078682"      ".0021382"                   
          "1111B0"               "Grain farming"                                                                              "0"               ".1329214"      "0"                          
          "111200"               "Vegetable and melon farming"                                                                "0"               "0"             ".035414"                    
          "111300"               "Fruit and tree nut farming"                                                                 "0"               "0"             "0"                          
          "111400"               "Greenhouse, nursery, and floriculture production"                                           "0"               "0"             "0"                          
          "111900"               "Other crop farming"                                                                         ".004181"         ".0084153"      "0"                          
          "112120"               "Dairy cattle and milk production"                                                           "0"               "0"             "0"                          
          "1121A0"               "Beef cattle ranching and farming, including feedlots and dual-purpose ranching and farming" ".0008285"        ".0036489"      ".000941"                    
          "112300"               "Poultry and egg production"                                                                 ".0002802"        "0"             "0"                          
          "112A00"               "Animal production, except cattle and poultry and eggs"                                      ".0004539"        ".0024357"      ".000482"                    
          "113000"               "Forestry and logging"                                                                       "0"               "0"             "0"                          
          "114000"               "Fishing, hunting and trapping"                                                              "0"               "0"             "0"                          
          "115000"               "Support activities for agriculture and forestry"                                            ".034694"         ".1383307"      ".1233774"                   
          "211000"               "Oil and gas extraction"                                                                     "0"               "0"             "0"                          
          "212100"               "Coal mining"                                                                                "0"               "0"             "0"                          
          "212230"               "Copper, nickel, lead, and zinc mining"                                                      "0"               "0"             "0"                          
          "2122A0"               "Iron, gold, silver, and other metal ore mining"                                             "0"               "0"             "0"                          
          end
          ------------------ copy up to and including the previous line ------------------

          Listed 20 out of 410 observations

          Comment


          • #6
            Thanks.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str20 A str96 B str17(C D) str27 E
            ""                     ""                                                                                           ""                ""              ""                           
            "Commodity / Industry" ""                                                                                           "Oilseed farming" "Grain farming" "Vegetable and melon farming"
            "Code"                 "Commodity Description"                                                                      "1111A0"          "1111B0"        "111200"                     
            "1111A0"               "Oilseed farming"                                                                            ".2180219"        ".0078682"      ".0021382"                   
            "1111B0"               "Grain farming"                                                                              "0"               ".1329214"      "0"                          
            "111200"               "Vegetable and melon farming"                                                                "0"               "0"             ".035414"                    
            "111300"               "Fruit and tree nut farming"                                                                 "0"               "0"             "0"                          
            "111400"               "Greenhouse, nursery, and floriculture production"                                           "0"               "0"             "0"                          
            "111900"               "Other crop farming"                                                                         ".004181"         ".0084153"      "0"                          
            "112120"               "Dairy cattle and milk production"                                                           "0"               "0"             "0"                          
            "1121A0"               "Beef cattle ranching and farming, including feedlots and dual-purpose ranching and farming" ".0008285"        ".0036489"      ".000941"                    
            "112300"               "Poultry and egg production"                                                                 ".0002802"        "0"             "0"                          
            "112A00"               "Animal production, except cattle and poultry and eggs"                                      ".0004539"        ".0024357"      ".000482"                    
            "113000"               "Forestry and logging"                                                                       "0"               "0"             "0"                          
            "114000"               "Fishing, hunting and trapping"                                                              "0"               "0"             "0"                          
            "115000"               "Support activities for agriculture and forestry"                                            ".034694"         ".1383307"      ".1233774"                   
            "211000"               "Oil and gas extraction"                                                                     "0"               "0"             "0"                          
            "212100"               "Coal mining"                                                                                "0"               "0"             "0"                          
            "212230"               "Copper, nickel, lead, and zinc mining"                                                      "0"               "0"             "0"                          
            "2122A0"               "Iron, gold, silver, and other metal ore mining"                                             "0"               "0"             "0"                          
            end
            
            rename (A B) (code_supplier description_supplier)
            qui ds
            foreach var of varlist C - `=word("`r(varlist)'", -1)'{
                rename `var' `=strtoname("`=`var'[3]'")'
            }
            qui ds
            rename (`=word("`r(varlist)'", 3)' - `=word("`r(varlist)'", -1)') value=
            gen long id=_n
            drop in 1/3
            reshape long value, i(id) j(code_user) string
            replace code_user= trim(itrim(subinstr(code_user, "_", "", .)))
            frame put code_supplier description_supplier, into(join)
            frame join{
                contract *
            }
            frlink m:1 code_user, frame(join code_supplier)
            frget description_user = description_supplier, from(join)
            drop id join
            frame drop join
            order code_supplier description_supplier code_user description_user value
            Res.:

            Code:
            . l, sepby(code_supplier)
            
                 +-----------------------------------------------------------------------------------------------------------------------------------------------------------+
                 | code_s~r                                                                         description_supplier   code_u~r              description_user      value |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
              1. |   1111A0                                                                              Oilseed farming     1111A0               Oilseed farming   .2180219 |
              2. |   1111A0                                                                              Oilseed farming     1111B0                 Grain farming   .0078682 |
              3. |   1111A0                                                                              Oilseed farming     111200   Vegetable and melon farming   .0021382 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
              4. |   1111B0                                                                                Grain farming     1111A0               Oilseed farming          0 |
              5. |   1111B0                                                                                Grain farming     1111B0                 Grain farming   .1329214 |
              6. |   1111B0                                                                                Grain farming     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
              7. |   111200                                                                  Vegetable and melon farming     1111A0               Oilseed farming          0 |
              8. |   111200                                                                  Vegetable and melon farming     1111B0                 Grain farming          0 |
              9. |   111200                                                                  Vegetable and melon farming     111200   Vegetable and melon farming    .035414 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             10. |   111300                                                                   Fruit and tree nut farming     1111A0               Oilseed farming          0 |
             11. |   111300                                                                   Fruit and tree nut farming     1111B0                 Grain farming          0 |
             12. |   111300                                                                   Fruit and tree nut farming     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             13. |   111400                                             Greenhouse, nursery, and floriculture production     1111A0               Oilseed farming          0 |
             14. |   111400                                             Greenhouse, nursery, and floriculture production     1111B0                 Grain farming          0 |
             15. |   111400                                             Greenhouse, nursery, and floriculture production     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             16. |   111900                                                                           Other crop farming     1111A0               Oilseed farming    .004181 |
             17. |   111900                                                                           Other crop farming     1111B0                 Grain farming   .0084153 |
             18. |   111900                                                                           Other crop farming     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             19. |   112120                                                             Dairy cattle and milk production     1111A0               Oilseed farming          0 |
             20. |   112120                                                             Dairy cattle and milk production     1111B0                 Grain farming          0 |
             21. |   112120                                                             Dairy cattle and milk production     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             22. |   1121A0   Beef cattle ranching and farming, including feedlots and dual-purpose ranching and farming     1111A0               Oilseed farming   .0008285 |
             23. |   1121A0   Beef cattle ranching and farming, including feedlots and dual-purpose ranching and farming     1111B0                 Grain farming   .0036489 |
             24. |   1121A0   Beef cattle ranching and farming, including feedlots and dual-purpose ranching and farming     111200   Vegetable and melon farming    .000941 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             25. |   112300                                                                   Poultry and egg production     1111A0               Oilseed farming   .0002802 |
             26. |   112300                                                                   Poultry and egg production     1111B0                 Grain farming          0 |
             27. |   112300                                                                   Poultry and egg production     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             28. |   112A00                                        Animal production, except cattle and poultry and eggs     1111A0               Oilseed farming   .0004539 |
             29. |   112A00                                        Animal production, except cattle and poultry and eggs     1111B0                 Grain farming   .0024357 |
             30. |   112A00                                        Animal production, except cattle and poultry and eggs     111200   Vegetable and melon farming    .000482 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             31. |   113000                                                                         Forestry and logging     1111A0               Oilseed farming          0 |
             32. |   113000                                                                         Forestry and logging     1111B0                 Grain farming          0 |
             33. |   113000                                                                         Forestry and logging     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             34. |   114000                                                                Fishing, hunting and trapping     1111A0               Oilseed farming          0 |
             35. |   114000                                                                Fishing, hunting and trapping     1111B0                 Grain farming          0 |
             36. |   114000                                                                Fishing, hunting and trapping     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             37. |   115000                                              Support activities for agriculture and forestry     1111A0               Oilseed farming    .034694 |
             38. |   115000                                              Support activities for agriculture and forestry     1111B0                 Grain farming   .1383307 |
             39. |   115000                                              Support activities for agriculture and forestry     111200   Vegetable and melon farming   .1233774 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             40. |   211000                                                                       Oil and gas extraction     1111A0               Oilseed farming          0 |
             41. |   211000                                                                       Oil and gas extraction     1111B0                 Grain farming          0 |
             42. |   211000                                                                       Oil and gas extraction     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             43. |   212100                                                                                  Coal mining     1111A0               Oilseed farming          0 |
             44. |   212100                                                                                  Coal mining     1111B0                 Grain farming          0 |
             45. |   212100                                                                                  Coal mining     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             46. |   212230                                                        Copper, nickel, lead, and zinc mining     1111A0               Oilseed farming          0 |
             47. |   212230                                                        Copper, nickel, lead, and zinc mining     1111B0                 Grain farming          0 |
             48. |   212230                                                        Copper, nickel, lead, and zinc mining     111200   Vegetable and melon farming          0 |
                 |-----------------------------------------------------------------------------------------------------------------------------------------------------------|
             49. |   2122A0                                               Iron, gold, silver, and other metal ore mining     1111A0               Oilseed farming          0 |
             50. |   2122A0                                               Iron, gold, silver, and other metal ore mining     1111B0                 Grain farming          0 |
             51. |   2122A0                                               Iron, gold, silver, and other metal ore mining     111200   Vegetable and melon farming          0 |
                 +-----------------------------------------------------------------------------------------------------------------------------------------------------------+
            
            .

            Comment


            • #7
              Thanks so much, Andrew. This is great. I really appreciate you taking the time to help me with the dataset. I will try these codes today. Just one question - if I want to do it for all the columns of my dataset, do I need to make changes in this line of code? Thanks again!
              foreach var of varlist C - `=word("`r(varlist)'", -1)'{ rename `var' `=strtoname("`=`var'[3]'")' }

              Comment


              • #8
                No changes. The code generalizes to any number of variables in the dataset.

                Comment


                • #9
                  Thank you so much, Andrew. The code works well on my data and I am able to achieve the desired result. I am using this data for my thesis. I will be highly grateful to you if you could suggest me some resources that would help me in undertaking similar Stata tasks in the future. Looking forward to hearing from you. Thanks much

                  Comment


                  • #10
                    Here are three resources that I have found useful:

                    1. The Stata documentation, which can be accessed by typing:


                    Code:
                    help commandname
                    or

                    Code:
                    help functionname()
                    where "commandname" and "functionname()" refer to the name of the command or function that you are interested in. These will provide links to the PDF manual entries for the commands or functions and examples.

                    2. The Speaking Stata columns by Nick Cox in the Stata Journal. If you systematically go through these, you will gain expertise to tackle a good number of data management problems that often arise. You can find these at https://www.stata-journal.com/sjsear...speaking+stata.

                    3. Finally, I have also benefited from reading and understanding various posts on Statalist, particularly those by Nick Cox, Clyde Schechter, and Robert Picard.

                    Comment


                    • #11
                      Thanks, Andrew, for listing these resources. I really appreciate your help. Have a great weekend!

                      Comment


                      • #12
                        Hi All,
                        I am working with the input output tables. With (Andrew's) help, I managed to transform the dataset from matrix to long format. However, now I am trying to link BEA industry codes to NAICS and managed to do that mapping.
                        Since I wan to see how much one industry uses from another and vice-versa, I want to do two things.
                        1) BEA (code_supplier) to NAICS (naics_supplier) - how many NAICS codes are linked to BEA. For that, I created the tag and wanted variables. However, I want to sum the values column if there is 1 NAICS linked to multiple BEA codes. I am not sure how to achieve that. Is there a way to see such cases?
                        2) I want to produce values for each unique industry pair and how to see these pairs stacked over one another. For instance - soybean farming supplier (NAICS 111110) supplying to wheat farming user (NAICS 111140) is 455 and vice versa (wheat farming supplying to soybean farming is how much?). I want to produce unique pairwise combinations of buyers and sellers along with their values. I am not sure how to obtain this result. My dataset currently looks like this. I only want to have three columns based on the dataset (using NAICS) buyers, sellers, and value.

                        Code:
                        * Example generated by -dataex-. For more info, type help dataex
                        clear
                        input str6 code_supplier str96 description_supplier str6 code_user str96 description_user str119 naics_title_supplier long naics_supplier str6 naics_user strL naics_title_user long(value total_output_user) byte(tag wanted)
                        "S00401" "Scrap"                        "4200ID" "Customs duties"  "" . ""       ""                                    0 38513 0 0
                        "S00300" "Noncomparable imports"        "4200ID" "Customs duties"  "" . ""       ""                                    0 38513 0 0
                        "S00402" "Used and secondhand goods"    "4200ID" "Customs duties"  "" . ""       ""                                    0 38513 0 0
                        "S00900" "Rest of the world adjustment" "4200ID" "Customs duties"  "" . ""       ""                                    0 38513 0 0
                        "4200ID" "Customs duties"               "4200ID" "Customs duties"  "" . ""       ""                                    0 38513 0 0
                        "4200ID" "Customs duties"               "1111A0" "Oilseed farming" "" . "111110" "Soybean Farming"                     0 38216 0 0
                        "S00402" "Used and secondhand goods"    "1111A0" "Oilseed farming" "" . "111110" "Soybean Farming"                     0 38216 0 0
                        "S00900" "Rest of the world adjustment" "1111A0" "Oilseed farming" "" . "111110" "Soybean Farming"                     0 38216 0 0
                        "S00300" "Noncomparable imports"        "1111A0" "Oilseed farming" "" . "111110" "Soybean Farming"                   111 38216 0 0
                        "S00401" "Scrap"                        "1111A0" "Oilseed farming" "" . "111110" "Soybean Farming"                     0 38216 0 0
                        "S00900" "Rest of the world adjustment" "1111A0" "Oilseed farming" "" . "111120" "Oilseed (except Soybean) Farming "   0 38216 0 0
                        "4200ID" "Customs duties"               "1111A0" "Oilseed farming" "" . "111120" "Oilseed (except Soybean) Farming "   0 38216 0 0
                        "S00300" "Noncomparable imports"        "1111A0" "Oilseed farming" "" . "111120" "Oilseed (except Soybean) Farming " 111 38216 0 0
                        "S00401" "Scrap"                        "1111A0" "Oilseed farming" "" . "111120" "Oilseed (except Soybean) Farming "   0 38216 0 0
                        "S00402" "Used and secondhand goods"    "1111A0" "Oilseed farming" "" . "111120" "Oilseed (except Soybean) Farming "   0 38216 0 0
                        "4200ID" "Customs duties"               "1111B0" "Grain farming"   "" . "111130" "Dry Pea and Bean Farming "           0 57882 0 0
                        "S00401" "Scrap"                        "1111B0" "Grain farming"   "" . "111130" "Dry Pea and Bean Farming "           0 57882 0 0
                        "S00900" "Rest of the world adjustment" "1111B0" "Grain farming"   "" . "111130" "Dry Pea and Bean Farming "           0 57882 0 0
                        "S00402" "Used and secondhand goods"    "1111B0" "Grain farming"   "" . "111130" "Dry Pea and Bean Farming "           0 57882 0 0
                        "S00300" "Noncomparable imports"        "1111B0" "Grain farming"   "" . "111130" "Dry Pea and Bean Farming "         371 57882 0 0
                        end
                        Thanks for your help in advance!

                        Preety
                        Last edited by Preety Bhogal; 28 Mar 2024, 22:02.

                        Comment

                        Working...
                        X