Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Question about xpose and sxpose?

    HI Folks, any chance there is an option I'm overlooking in xpose or sxpose here?

    Mock version of my data:
    Code:
    clear all
    
    input str40 words firm1 firm2 firm3
                "cat"   28     0     0 
                "dog"   26     0    26 
                "hat"   21     0    14 
                "desk"  19     0     7 
                "car"   17     0     2 
    end
    
    . list
    
         +-------------------------------+
         | words   firm1   firm2   firm3 |
         |-------------------------------|
      1. |   cat      28       0       0 |
      2. |   dog      26       0      26 |
      3. |   hat      21       0      14 |
      4. |  desk      19       0       7 |
      5. |   car      17       0       2 |
         +-------------------------------+
    The goal (variable for each of the words plus a variable for the firms):
    Code:
         +-----------------------------------------+
         | cat   dog   hat   desk   car   firmname |
         |-----------------------------------------|
      1. |  28    26    21     19    17      firm1 |
      2. |   0     0     0      0     0      firm2 |
      3. |   0    26    14      7     2      firm3 |
         +-----------------------------------------+
    Worked on a few possible solutions. The two best are below. Anyone able to help me figure out the final touches?

    First attempt (lost track of the words):
    Code:
    xpose, clear varname
    list
    
         +-----------------------------------+
         | v1   v2   v3   v4   v5   _varname |
         |-----------------------------------|
      1. |  .    .    .    .    .      words |
      2. | 28   26   21   19   17      firm1 |
      3. |  0    0    0    0    0      firm2 |
      4. |  0   26   14    7    2      firm3 |
         +-----------------------------------+
    Second attempt (lost track of firm name):
    Code:
    sxpose, clear force firstnames destring
    list
    
         +------------------------------+
         | cat   dog   hat   desk   car |
         |------------------------------|
      1. |  28    26    21     19    17 |
      2. |   0     0     0      0     0 |
      3. |   0    26    14      7     2 |
         +------------------------------+

  • #2
    For problems like this - where there is mix of numeric and string data - using reshape twice is often the easiest, if least intuitive, approach. Here's an example, where I've changed your firm names to something less suggestive of having a pattern.

    Added in edit: this code assumes that the values of the variable "words" are distinct, and that each value is usable as a Stata variable name. The latter can be addressed with the strtoname() function, the former by using the observation number to create an id variable and adjusting the first reshape command accordingly.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 words float(audi fiat mini)
    "cat"  28 0  0
    "dog"  26 0 26
    "hat"  21 0 14
    "desk" 19 0  7
    "car"  17 0  2
    end
    
    isid words
    rename (audi fiat mini) (value=)
    reshape long value, i(words) j(firmname) string
    list in 1/6, noobs sepby(words)
    
    reshape wide value, i(firmname) j(words) string
    rename (value*) (*)
    list, noobs
    Code:
    . isid words
    
    . rename (audi fiat mini) (value=)
    
    . reshape long value, i(words) j(firmname) string
    (note: j = audi fiat mini)
    
    Data                               wide   ->   long
    -----------------------------------------------------------------------------
    Number of obs.                        5   ->      15
    Number of variables                   4   ->       3
    j variable (3 values)                     ->   firmname
    xij variables:
              valueaudi valuefiat valuemini   ->   value
    -----------------------------------------------------------------------------
    
    . list in 1/6, noobs sepby(words)
    
      +--------------------------+
      | words   firmname   value |
      |--------------------------|
      |   car       audi      17 |
      |   car       fiat       0 |
      |   car       mini       2 |
      |--------------------------|
      |   cat       audi      28 |
      |   cat       fiat       0 |
      |   cat       mini       0 |
      +--------------------------+
    
    .
    . reshape wide value, i(firmname) j(words) string
    (note: j = car cat desk dog hat)
    
    Data                               long   ->   wide
    -----------------------------------------------------------------------------
    Number of obs.                       15   ->       3
    Number of variables                   3   ->       6
    j variable (5 values)             words   ->   (dropped)
    xij variables:
                                      value   ->   valuecar valuecat ... valuehat
    -----------------------------------------------------------------------------
    
    . rename (value*) (*)
    
    . list, noobs
    
      +-----------------------------------------+
      | firmname   car   cat   desk   dog   hat |
      |-----------------------------------------|
      |     audi    17    28     19    26    21 |
      |     fiat     0     0      0     0     0 |
      |     mini     2     0      7    26    14 |
      +-----------------------------------------+
    Last edited by William Lisowski; 19 May 2018, 17:03.

    Comment


    • #3
      Here's another approach--not better, just different. It's just the -xpose- with a renaming of the variables.
      Code:
      clear all
      input str40 words firm1 firm2 firm3
      "cat" 28 0 0
      "dog" 26 0 26
      "hat" 21 0 14
      "desk" 19 0 7
      "car" 17 0 2
      end
      //
      levelsof words, local(wordlist)  // will become varnames
      local nvar = r(r)
      xpose, clear varname
      drop in 1
      // Should be a way to do the following with a one line -rename group-,
      // but I couldn't figure it out, so I looped.
      forval i = 1/`nvar' {
        local nextvar: word `i' of `wordlist'
        rename v`i' `nextvar'
      }
      rename _varname firm

      Comment


      • #4
        Thank you both so much for the suggestions. So helpful.

        I saw Mike's solution as robust to datasets with other more dynamic sets of firm names.

        I did need to add the line:

        Code:
        local nvar = c(k) - 1
        Thus, the solution I'll be going with (modified to represent people instead of firms)...

        Code:
        clear all
        input str40 words adam mike will dave nick carl
          "cat"  28 0 0  3  5  10
          "dog"  26 0 26 4  0  20
          "hat"  21 0 14 10 3   4
          "desk" 19 0 7  12 8   3
          "car"  17 0 2  9  9  20
        end
        
        levelsof words, local(wordlist)  // will become varnames
        local nvar = r(r)
        xpose, clear varname
        drop in 1
        
        // Should be a way to do the following with a one line -rename group-,
        // but I couldn't figure it out, so I looped.
        
        local nvar = c(k) - 1
        forval i = 1/`nvar' {
          local nextvar: word `i' of `wordlist'
          rename v`i' `nextvar'
        }
        rename _varname firm
        
        list, sep(0)

        Comment


        • #5
          sxpose is from SSC, as you are asked to explain (FAQ Advice #12). It presumably was the answer to somebody's question in 2004, but I would still reach for reshape. This has one fewer rename than William's but relies on your words being acceptable names.

          Code:
          clear all
          
          input str40 words firm1 firm2 firm3
                      "cat"   28     0     0 
                      "dog"   26     0    26 
                      "hat"   21     0    14 
                      "desk"  19     0     7 
                      "car"   17     0     2 
          end
          
          reshape long firm, i(words) j(id) 
          reshape wide firm, i(id) j(words) string 
          rename (firm*) (*) 
          
          list 
          
               +-----------------------------------+
               | id   car   cat   desk   dog   hat |
               |-----------------------------------|
            1. |  1    17    28     19    26    21 |
            2. |  2     0     0      0     0     0 |
            3. |  3     2     0      7    26    14 |
               +-----------------------------------+

          Comment


          • #6
            Thanks for the additional input, Nick. Thanks also for the FAQ reminder. My apologies for missing that one.

            Reshape, though elegant, isn't the solution for me. Though the mock data given fit a sequential firm nomenclature (e.g. firm1 firm2 firm3... firmk), my data won't necessarily satisfy that assumption.

            The "firm names" are just as likely to be a set of other names or identifiers such as person names (adam, mike, dave, nick, carl). Or there could be city names (madison, chicago, london, paris, tokyo). Another example could be places to eat (cafe, fastfood, subshop, pub, foodtruck).

            Correct me if I'm wrong but reshape wouldn't work when the firm names lack a common stub and a sequential suffix, eh?

            Comment


            • #7
              From post #6

              Correct me if I'm wrong but reshape wouldn't work when the firm names lack a common stub and a sequential suffix, eh?
              I anticipated that the firm names given in post #1 were not representative of the actual firm names and in post #2 I wrote

              Here's an example, where I've changed your firm names to something less suggestive of having a pattern.
              The technique is to use a trivial batch rename to prepend a common stub to a set of firm names that lack a common stub and a sequential suffix, and to use the string option on reshape long.
              Last edited by William Lisowski; 20 May 2018, 10:14.

              Comment


              • #8
                William is right. Arbitrary names are not a barrier to reshape here. They just mean that you have to do more work. Here is an example showing another way to do it.

                Code:
                clear all
                
                input str40 words firm1 firm2 firm3
                            "cat 1"   28     0     0 
                            "dog 2"   26     0    26 
                            "hat 3"   21     0    14 
                            "desk 4"  19     0     7 
                            "car 5"   17     0     2 
                end
                
                egen id = group(words), label 
                su id, meanonly  
                local J `r(max)'
                 
                forval j = 1/`J' { 
                    local label`j' "`: label (id) `j''" 
                }
                
                reshape long firm, i(id) j(ID) 
                
                drop words 
                reshape wide firm, i(ID) j(id)  
                
                forval j = 1/`J' { 
                    label var firm`j' "`label`j''"
                }
                
                describe 
                
                list 
                
                . describe 
                
                Contains data
                  obs:             3                          
                 vars:             6                          
                 size:            63                          
                -------------------------------------------------------------------------------------
                              storage   display    value
                variable name   type    format     label      variable label
                -------------------------------------------------------------------------------------
                ID              byte    %9.0g                 
                firm1           float   %9.0g                 car 5
                firm2           float   %9.0g                 cat 1
                firm3           float   %9.0g                 desk 4
                firm4           float   %9.0g                 dog 2
                firm5           float   %9.0g                 hat 3
                -------------------------------------------------------------------------------------
                Sorted by: ID
                
                . 
                . list 
                
                     +--------------------------------------------+
                     | ID   firm1   firm2   firm3   firm4   firm5 |
                     |--------------------------------------------|
                  1. |  1      17      28      19      26      21 |
                  2. |  2       0       0       0       0       0 |
                  3. |  3       2       0       7      26      14 |
                     +--------------------------------------------+

                Comment


                • #9
                  As well noted by Nick at #5, your words must be accepted as variable names. In this assumption, as suggested by Mike at #3, it does exist a rename solution, which is working along with the help of clean option for levelof. Anyhow, my favorite is still going for reshape solution as showed by William and Nick.
                  Code:
                  levelsof words, local(wordlist) clean
                  xpose, clear varname
                  ren (v*) (`wordlist')
                  ren _varname firm
                  drop in 1
                  Last edited by Romalpa Akzo; 21 May 2018, 05:30.

                  Comment


                  • #10
                    deleted post
                    Last edited by michael joe; 10 Mar 2020, 13:13.

                    Comment


                    • #11
                      Hi - I am trying to use this code with my data:
                      rename _cpc21_2 words

                      levelsof words, local(wordlist) // will become varnames
                      local nvar = r(r)
                      xpose, clear varname

                      // Should be a way to do the following with a one line -rename group-,
                      // but I couldn't figure it out, so I looped.

                      local nvar = c(k) - 1

                      forval i = 1/`nvar' {
                      local nextvar: word `i' of `wordlist'
                      rename v`i' `nextvar'
                      }

                      Where _cpc21_2 is product codes with an underscore i.e. _1, _2, _3.

                      It is giving me the below error - and my variables are not long or beging with numeric:


                      too many wildcards in newname
                      You requested v1 be renamed _.. There are more wildcards in new than in old. Wildcards in
                      old and new correspond one-to-one from left-to-right unless you specify explicit subscripts
                      in new. Anyway, rename ran out of wildcards in old when matching the wildcards in new.
                      Perhaps you just made a mistake or perhaps you forgot an explicit subscript in new. Or
                      perhaps you forgot to specify option addnumber, which allows you to specify an extra #, (#),
                      (##), ...

                      If I do the reshape option then it says that j(id) contains all msising variables

                      and also if i do:
                      ren (v*) (`wordlist')
                      oldnames and newnames do not match
                      You specified 1 pattern or name for oldname and 2 patterns or names for newnames.
                      Last edited by Eleanor Keeble; 28 Dec 2022, 11:43.

                      Comment


                      • #12
                        Show us a sample of your dataset. The codes suggest that you have a variable named _cpc21_2 and variables named v1, v2, v3, and so on. So, you may copy and paste the result of

                        Code:
                        dataex _cpc21_2 v1-v5 in 1/5

                        Comment

                        Working...
                        X