  • Reshape long to wide

    Hi everyone,
    I'm a newbie in Stata and would really appreciate it if somebody could help me.

    I have searched the forum but haven't found an answer to my problem. I would like to reshape my data from long to wide, with a separate column for each service sector and its respective values. I have uploaded a picture of the data I have.

    [Image: pic data.png]

  • #2
    Hello everybody,
    Would it be possible to get help with this?
    Thanks in advance.



    • #3
      Please let me note, per the StataList FAQ, that image postings of data and the like are deprecated, and that using -dataex- to post data examples is strongly recommended. If you had used -dataex-, answering your question would have been easier and you would have been more likely to get a quick answer. Knowing about this would likely help you in the future.

      Also: By "each service sector" I have presumed you to mean "each ProductSector."

      The one difficulty in your situation is that ProductSector is a relatively long string, which will not work nicely as the "j" part of the eventual values* names in the wide version of your data set after the reshape. One can easily enough use -encode- to make ProductSector into a numeric variable with the original strings as value labels, and then reshape, but the harder part is to retain the original string information to use as variable labels for the values* fields after the reshape. Here's a way to do that, but I hope someone else can show an easier way to do this than I can <grin>.

      Finally: Wide format is rarely desirable for data analysis in Stata. If you can explain why you want it, there's a reasonable chance we can explain it to be unnecessary.

      Code:
      encode ProductSector, gen(psector) // numeric; gets around long strings
      // Save the string information for use as variable labels
      // for the values* variables in the wide data set.
      levelsof psector, local(pslist)
      foreach p of local pslist {
         local varlabel`p': label (psector) `p'
         di "`varlabel`p''"
      }
      drop ProductSector // no longer needed
      // Straightforward reshape
      reshape wide values, i(id year) j(psector)
      // Put the variable labels on the new values* variables
      foreach p of local pslist {
         label var values`p' "`varlabel`p''"
      }
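
      For anyone who wants to try the approach above without the original file, here is a minimal sketch on a made-up data set. The sector names, the values, and the id variable are all invented for illustration; only the variable names ProductSector and values follow the thread.

      ```stata
      clear
      * Hypothetical long-format data: one row per id, year, and sector
      input int id int year str30 ProductSector double values
      1 2013 "Transport services" 10
      1 2013 "Financial services" 20
      1 2014 "Transport services" 11
      1 2014 "Financial services" 21
      end
      * encode assigns codes in alphabetical order of the strings
      encode ProductSector, gen(psector)
      * Keep each sector name in a local macro before the string is dropped
      levelsof psector, local(pslist)
      foreach p of local pslist {
         local varlabel`p' : label (psector) `p'
      }
      drop ProductSector
      reshape wide values, i(id year) j(psector)
      * Attach the saved sector names as variable labels
      foreach p of local pslist {
         label var values`p' "`varlabel`p''"
      }
      describe
      list, noobs
      ```

      Because -encode- numbers the sectors alphabetically, values1 should end up holding "Financial services" and values2 "Transport services", with one row per id and year.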
      Last edited by Mike Lacy; 28 Feb 2020, 10:34.



      • #4
        Thank you very much for the reply, Mike.
        I will take note of that and use dataex to produce an example in the future.

        The reason I want to do this is to examine the effect of each sector individually and to merge the data with another data set on new firm formation at the country level.
        I'm unsure whether this is the right way to do it, but I decided to start like this.

        Again, thank you for the reply and for being so kind as to explain both dataex and my problem.




        • #5
          If you want to look at sectors individually, I think it would likely be much easier in the long format. (The -by- prefix and the -statsby- command would be relevant here.)
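
          As a rough illustration of that suggestion, the sketch below runs a per-sector summary and a per-sector regression on data kept in long format. The data are invented, and the regression of values on year is only a placeholder for whatever model is actually of interest.

          ```stata
          clear
          * Hypothetical long-format data: one row per year and sector
          input int year str30 ProductSector double values
          2013 "Transport services" 10
          2014 "Transport services" 12
          2015 "Transport services" 15
          2013 "Financial services" 20
          2014 "Financial services" 26
          2015 "Financial services" 31
          end
          * Summary statistics computed separately for each sector
          bysort ProductSector: summarize values
          * One regression per sector; the coefficients replace the data in memory
          statsby _b, by(ProductSector) clear: regress values year
          list, noobs
          ```

          After -statsby- with the clear option, the data in memory hold one row per sector with the estimated coefficients, so save the original data first if they are still needed.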



          • #6
            Thank you again so much for your help, Mike.
            Since I'm new to Stata, I was unaware of those commands. I decided to leave the data in long format and use them to look at each sector individually.

            But I'm now facing another problem that I can't seem to find a solution to.
            I'm trying to merge two data sets and I get the following error: "variables Country year do not uniquely identify observations in the master data"

            The first data set looks like this:
            Code:
            input str1 Country int year byte(var1 var2 var3)
            "A" 2013  6  6  7
            "A" 2013  8  1  3
            "A" 2013  4  8  9
            "A" 2013 10 14 15
            "A" 2014  1  2  3
            "A" 2014  1  5  4
            "A" 2014  6  8  5
            "A" 2015  5  7  5
            "B" 2013  7  3  7
            "B" 2013  8  5  8
            end
            The second looks like this:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str1 Country int year byte(var1 var2 var3)
            "A" 2013 1 4 6
            "A" 2014 2 4 6
            "A" 2015 2 4 6
            "B" 2013 3 4 6
            "B" 2014 1 2 3
            end
            Last edited by George Hristov; 29 Feb 2020, 10:24.



            • #7
              Another recommendation of the FAQ is to always show exactly what command you gave and exactly what Stata responded. Without that, you are leaving us to guess about what you did, and we can't effectively help you. My guess is that you tried a merge command with 1:1, or without specifying both Country and year as key variables. (I am guessing that Country and year constitute your key variables.)

              You will help yourself by reading -help merge- with particular attention to 1:1/m:1/1:m (1 to 1, 1 to many, many to 1), and about the possibility of having multiple variables as the key. A *possible* solution to your problem is the following:
              Code:
              use "YourMasterFile.dta"
              // The "using" file is Stata terminology for the other file in a merge.
              merge m:1 Country year using "YourUsingFile.dta"
              This tells Stata: "Each observation in the master file is identified by a combination of Country and year, but there may be many observations for each such combination. In the using file, there is only one observation for each combination of Country and year. Many observations in the master file are therefore intended to be matched by one observation taken from the using file."
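
              Here is a sketch of the same idea, run on the two example data sets from #6 so it can be tried without any files on disk. Two things here are my own assumptions rather than anything from the original post: the country-level variables are renamed x1-x3, because for same-named non-key variables -merge- keeps the master's values by default, and a tempfile stands in for "YourUsingFile.dta".

              ```stata
              clear
              * Country-level data: one row per Country and year (the "using" file)
              input str1 Country int year byte(x1 x2 x3)
              "A" 2013 1 4 6
              "A" 2014 2 4 6
              "A" 2015 2 4 6
              "B" 2013 3 4 6
              "B" 2014 1 2 3
              end
              isid Country year                // stops with an error if the key is not unique
              tempfile countrylevel
              save `countrylevel'

              clear
              * Master data: several rows per Country and year
              input str1 Country int year byte(var1 var2 var3)
              "A" 2013  6  6  7
              "A" 2013  8  1  3
              "A" 2013  4  8  9
              "A" 2013 10 14 15
              "A" 2014  1  2  3
              "A" 2014  1  5  4
              "A" 2014  6  8  5
              "A" 2015  5  7  5
              "B" 2013  7  3  7
              "B" 2013  8  5  8
              end
              duplicates report Country year   // confirms the key repeats in the master
              merge m:1 Country year using `countrylevel'
              tabulate _merge
              ```

              Running -isid- and -duplicates report- before merging is a quick way to see which file holds the "many" side. In this example the B 2014 row exists only in the using data, so it appears with _merge equal to 2 rather than being matched.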



              • #8
                This question was reposted and answered similarly at

                https://www.statalist.org/forums/for...panel-datasets



                • #9
                  Thank you so much, Mike. That's exactly what I needed. Sorry for making you guess.

                  I will take note of that, read the FAQ more thoroughly, and give more information the next time I ask for help.

                  Thank you again so much for your help. I highly appreciate it; with your help I managed to solve my problem.
                  Last edited by George Hristov; 29 Feb 2020, 12:17.

