  • Reshaping Data, Wide format from long


    Hello everyone,

    I'm having trouble formatting my data. I have a dataset that looks like this:

    Country      Year  Gender       Observation
    Afghanistan  1976  Female       112587
    Afghanistan  1976  All genders  729667
    Afghanistan  1975  All genders  692342
    Afghanistan  1975  Female       104019
    Albania      2014  All genders  195720
    Albania      2014  Female       92609

    I want to format it to look like this instead:

    Country      Year  Female  All genders
    Afghanistan  1976  112587  729667
    Afghanistan  1975  104019  692342
    Albania      2014  92609   195720

    Any idea on how I could do this?
    Thanks



  • #2
    Perhaps this example will start you in a useful direction.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str11 country int year str11 gender long observation
    "Afghanistan" 1976 "Female"      112587
    "Afghanistan" 1976 "All genders" 729667
    "Afghanistan" 1975 "All genders" 692342
    "Afghanistan" 1975 "Female"      104019
    "Albania"     2014 "All genders" 195720
    "Albania"     2014 "Female"       92609
    end
    replace gender = "All" if gender=="All genders"
    reshape wide observation, i(country year) j(gender) string
    rename (observation*) (*)
    Code:
    . list
    
         +--------------------------------------+
         |     country   year      All   Female |
         |--------------------------------------|
      1. | Afghanistan   1975   692342   104019 |
      2. | Afghanistan   1976   729667   112587 |
      3. |     Albania   2014   195720    92609 |
         +--------------------------------------+
    Please note the way I presented your example data using the dataex command, once I had finished messing around to get your listing into Stata as usable data (and in the process of messing about, I lost the capitalization of the variable names). In the future, please help those whose help you seek by using the dataex command to show Stata example data.

    While the data you have shown were not difficult to import to Stata, that is not always the case. And even when data go in easily, we always lose information such as data storage types, value labelling, and display formats. Sometimes those things are crucial to getting the solution right. When you use dataex you enable those who want to help you to quickly and easily create a 100% faithful replica of your situation by copying from your post and pasting into Stata. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it.
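As a rough sketch of what that might look like here, assuming the dataset is already loaded in Stata with the lower-case variable names used in the example above, a call along these lines would produce a listing that can be pasted straight into a post. The `in 1/6` range is only an illustrative choice to keep the listing short.

```stata
* Sketch only: assumes the variables are named as in the example above
* and that dataex is installed (see the note on versions above).
* List the first six observations of the four variables in a form
* that others can copy and paste into Stata.
dataex country year gender observation in 1/6
```

The Results window output, from the `* Example generated by -dataex-` line through `end`, is what goes into the forum post.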

    Added in edit: crossed with #3, which presents a more general solution to the problem of variable values unsuitable for use as a variable name. But we both agree that your future posts will be more attractive to respond to if you follow the guidance in the Statalist FAQ linked to from the top of every page, and from the page presented when creating a new post.
    Last edited by William Lisowski; 05 Feb 2022, 17:37.



    • #3
      Well, you can't do exactly that because All genders is not a legal Stata variable name: embedded blanks are not allowed. But you can come very close:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str11 country int year str11 gender long observation
      "Afghanistan" 1976 "Female"      112587
      "Afghanistan" 1976 "All genders" 729667
      "Afghanistan" 1975 "All genders" 692342
      "Afghanistan" 1975 "Female"      104019
      "Albania"     2014 "All genders" 195720
      "Albania"     2014 "Female"       92609
      end
      
      replace gender = strtoname(lower(gender))
      rename observation o_
      reshape wide o_, i(country year) j(gender) string
      rename o_* *
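To see why the `replace` step above makes the values of gender usable as a suffix in variable names, it may help to look at `strtoname()` in isolation. The short sketch below is not part of the original solution; it only displays what the function returns for the problematic value, and should show the embedded blank replaced by an underscore.

```stata
* Sketch only: inspect what strtoname() does to a value with an embedded blank.
* The blank is not allowed in a Stata variable name, so it is replaced.
display strtoname("All genders")

* Same value lower-cased first, as in the replace command above.
display strtoname(lower("All genders"))
```

Because the result is a legal name, `reshape wide` can append it to the `o_` stub without complaint.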
      In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      Added: Crossed with #2, with a largely similar solution.
