Question about xpose and sxpose?

Adam Ross Nelson

Join Date: Aug 2015
Posts: 37

Question about xpose and sxpose?

19 May 2018, 15:04

HI Folks, any chance there is an option I'm overlooking in xpose or sxpose here?

Mock version of my data:

Code:

clear all

input str40 words firm1 firm2 firm3
            "cat"   28     0     0 
            "dog"   26     0    26 
            "hat"   21     0    14 
            "desk"  19     0     7 
            "car"   17     0     2 
end

. list

     +-------------------------------+
     | words   firm1   firm2   firm3 |
     |-------------------------------|
  1. |   cat      28       0       0 |
  2. |   dog      26       0      26 |
  3. |   hat      21       0      14 |
  4. |  desk      19       0       7 |
  5. |   car      17       0       2 |
     +-------------------------------+

The goal (variable for each of the words plus a variable for the firms):

Code:

     +-----------------------------------------+
     | cat   dog   hat   desk   car   firmname |
     |-----------------------------------------|
  1. |  28    26    21     19    17      firm1 |
  2. |   0     0     0      0     0      firm2 |
  3. |   0    26    14      7     2      firm3 |
     +-----------------------------------------+

Worked on a few possible solutions. The two best are below. Anyone able to help me figure out the final touches?

First attempt (lost track of the words):

Code:

xpose, clear varname
list

     +-----------------------------------+
     | v1   v2   v3   v4   v5   _varname |
     |-----------------------------------|
  1. |  .    .    .    .    .      words |
  2. | 28   26   21   19   17      firm1 |
  3. |  0    0    0    0    0      firm2 |
  4. |  0   26   14    7    2      firm3 |
     +-----------------------------------+

Second attempt (lost track of firm name):

Code:

sxpose, clear force firstnames destring
list

     +------------------------------+
     | cat   dog   hat   desk   car |
     |------------------------------|
  1. |  28    26    21     19    17 |
  2. |   0     0     0      0     0 |
  3. |   0    26    14      7     2 |
     +------------------------------+

Tags: None

William Lisowski

Join Date: Dec 2014
Posts: 10150

19 May 2018, 16:58

For problems like this - where there is mix of numeric and string data - using reshape twice is often the easiest, if least intuitive, approach. Here's an example, where I've changed your firm names to something less suggestive of having a pattern.

Added in edit: this code assumes that the values of the variable "words" are distinct, and that each value is usable as a Stata variable name. The latter can be addressed with the strtoname() function, the former by using the observation number to create an id variable and adjusting the first reshape command accordingly.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 words float(audi fiat mini)
"cat"  28 0  0
"dog"  26 0 26
"hat"  21 0 14
"desk" 19 0  7
"car"  17 0  2
end

isid words
rename (audi fiat mini) (value=)
reshape long value, i(words) j(firmname) string
list in 1/6, noobs sepby(words)

reshape wide value, i(firmname) j(words) string
rename (value*) (*)
list, noobs

Code:

. isid words

. rename (audi fiat mini) (value=)

. reshape long value, i(words) j(firmname) string
(note: j = audi fiat mini)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        5   ->      15
Number of variables                   4   ->       3
j variable (3 values)                     ->   firmname
xij variables:
          valueaudi valuefiat valuemini   ->   value
-----------------------------------------------------------------------------

. list in 1/6, noobs sepby(words)

  +--------------------------+
  | words   firmname   value |
  |--------------------------|
  |   car       audi      17 |
  |   car       fiat       0 |
  |   car       mini       2 |
  |--------------------------|
  |   cat       audi      28 |
  |   cat       fiat       0 |
  |   cat       mini       0 |
  +--------------------------+

.
. reshape wide value, i(firmname) j(words) string
(note: j = car cat desk dog hat)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                       15   ->       3
Number of variables                   3   ->       6
j variable (5 values)             words   ->   (dropped)
xij variables:
                                  value   ->   valuecar valuecat ... valuehat
-----------------------------------------------------------------------------

. rename (value*) (*)

. list, noobs

  +-----------------------------------------+
  | firmname   car   cat   desk   dog   hat |
  |-----------------------------------------|
  |     audi    17    28     19    26    21 |
  |     fiat     0     0      0     0     0 |
  |     mini     2     0      7    26    14 |
  +-----------------------------------------+

Last edited by William Lisowski; 19 May 2018, 17:03.

Comment

Mike Lacy

Join Date: Apr 2014
Posts: 2416

19 May 2018, 17:10

Here's another approach--not better, just different. It's just the -xpose- with a renaming of the variables.

Code:

clear all
input str40 words firm1 firm2 firm3
"cat" 28 0 0
"dog" 26 0 26
"hat" 21 0 14
"desk" 19 0 7
"car" 17 0 2
end
//
levelsof words, local(wordlist)  // will become varnames
local nvar = r(r)
xpose, clear varname
drop in 1
// Should be a way to do the following with a one line -rename group-,
// but I couldn't figure it out, so I looped.
forval i = 1/`nvar' {
  local nextvar: word `i' of `wordlist'
  rename v`i' `nextvar'
}
rename _varname firm

Comment

Adam Ross Nelson

Join Date: Aug 2015
Posts: 37

19 May 2018, 17:34

Thank you both so much for the suggestions. So helpful.

I saw Mike's solution as robust to datasets with other more dynamic sets of firm names.

I did need to add the line:

Code:

local nvar = c(k) - 1

Thus, the solution I'll be going with (modified to represent people instead of firms)...

Code:

clear all
input str40 words adam mike will dave nick carl
  "cat"  28 0 0  3  5  10
  "dog"  26 0 26 4  0  20
  "hat"  21 0 14 10 3   4
  "desk" 19 0 7  12 8   3
  "car"  17 0 2  9  9  20
end

levelsof words, local(wordlist)  // will become varnames
local nvar = r(r)
xpose, clear varname
drop in 1

// Should be a way to do the following with a one line -rename group-,
// but I couldn't figure it out, so I looped.

local nvar = c(k) - 1
forval i = 1/`nvar' {
  local nextvar: word `i' of `wordlist'
  rename v`i' `nextvar'
}
rename _varname firm

list, sep(0)

Comment

Nick Cox

Join Date: Mar 2014
Posts: 35698

20 May 2018, 01:15

sxpose is from SSC, as you are asked to explain (FAQ Advice #12). It presumably was the answer to somebody's question in 2004, but I would still reach for reshape. This has one fewer rename than William's but relies on your words being acceptable names.

Code:

clear all

input str40 words firm1 firm2 firm3
            "cat"   28     0     0 
            "dog"   26     0    26 
            "hat"   21     0    14 
            "desk"  19     0     7 
            "car"   17     0     2 
end

reshape long firm, i(words) j(id) 
reshape wide firm, i(id) j(words) string 
rename (firm*) (*) 

list 

     +-----------------------------------+
     | id   car   cat   desk   dog   hat |
     |-----------------------------------|
  1. |  1    17    28     19    26    21 |
  2. |  2     0     0      0     0     0 |
  3. |  3     2     0      7    26    14 |
     +-----------------------------------+

Comment

Adam Ross Nelson

Join Date: Aug 2015

Posts: 37
#6

20 May 2018, 09:17

Thanks for the additional input, Nick. Thanks also for the FAQ reminder. My apologies for missing that one.

Reshape, though elegant, isn't the solution for me. Though the mock data given fit a sequential firm nomenclature (e.g. firm1 firm2 firm3... firmk), my data won't necessarily satisfy that assumption.

The "firm names" are just as likely to be a set of other names or identifiers such as person names (adam, mike, dave, nick, carl). Or there could be city names (madison, chicago, london, paris, tokyo). Another example could be places to eat (cafe, fastfood, subshop, pub, foodtruck).

Correct me if I'm wrong but reshape wouldn't work when the firm names lack a common stub and a sequential suffix, eh?
Comment
William Lisowski

Join Date: Dec 2014

Posts: 10150
#7

20 May 2018, 10:03

From post #6

Correct me if I'm wrong but reshape wouldn't work when the firm names lack a common stub and a sequential suffix, eh?

I anticipated that the firm names given in post #1 were not representative of the actual firm names and in post #2 I wrote

Here's an example, where I've changed your firm names to something less suggestive of having a pattern.

The technique is to use a trivial batch rename to prepend a common stub to a set of firm names that lack a common stub and a sequential suffix, and to use the string option on reshape long.

Last edited by William Lisowski; 20 May 2018, 10:14.
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35698

20 May 2018, 11:31

William is right. Arbitrary names are not a barrier to reshape here. They just mean that you have to do more work. Here is an example showing another way to do it.

Code:

clear all

input str40 words firm1 firm2 firm3
            "cat 1"   28     0     0 
            "dog 2"   26     0    26 
            "hat 3"   21     0    14 
            "desk 4"  19     0     7 
            "car 5"   17     0     2 
end

egen id = group(words), label 
su id, meanonly  
local J `r(max)'
 
forval j = 1/`J' { 
    local label`j' "`: label (id) `j''" 
}

reshape long firm, i(id) j(ID) 

drop words 
reshape wide firm, i(ID) j(id)  

forval j = 1/`J' { 
    label var firm`j' "`label`j''"
}

describe 

list 

. describe 

Contains data
  obs:             3                          
 vars:             6                          
 size:            63                          
-------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------------
ID              byte    %9.0g                 
firm1           float   %9.0g                 car 5
firm2           float   %9.0g                 cat 1
firm3           float   %9.0g                 desk 4
firm4           float   %9.0g                 dog 2
firm5           float   %9.0g                 hat 3
-------------------------------------------------------------------------------------
Sorted by: ID

. 
. list 

     +--------------------------------------------+
     | ID   firm1   firm2   firm3   firm4   firm5 |
     |--------------------------------------------|
  1. |  1      17      28      19      26      21 |
  2. |  2       0       0       0       0       0 |
  3. |  3       2       0       7      26      14 |
     +--------------------------------------------+

Comment

Romalpa Akzo

Join Date: Oct 2017

Posts: 369
#9

21 May 2018, 05:17

As well noted by Nick at #5, your words must be accepted as variable names. In this assumption, as suggested by Mike at #3, it does exist a rename solution, which is working along with the help of clean option for levelof. Anyhow, my favorite is still going for reshape solution as showed by William and Nick.

Code:

levelsof words, local(wordlist) clean xpose, clear varname ren (v*) (`wordlist') ren _varname firm drop in 1

Last edited by Romalpa Akzo; 21 May 2018, 05:30.
2 likes
Comment
michael joe

Join Date: May 2015

Posts: 50
#10

10 Mar 2020, 13:05

deleted post

Last edited by michael joe; 10 Mar 2020, 13:13.
Comment
Eleanor Keeble

Join Date: Dec 2022

Posts: 1
#11

28 Dec 2022, 11:28

Hi - I am trying to use this code with my data:
rename _cpc21_2 words

levelsof words, local(wordlist) // will become varnames
local nvar = r(r)
xpose, clear varname

// Should be a way to do the following with a one line -rename group-,
// but I couldn't figure it out, so I looped.

local nvar = c(k) - 1

forval i = 1/`nvar' {
local nextvar: word `i' of `wordlist'
rename v`i' `nextvar'
}

Where _cpc21_2 is product codes with an underscore i.e. _1, _2, _3.

It is giving me the below error - and my variables are not long or beging with numeric:

too many wildcards in newname
You requested v1 be renamed _.. There are more wildcards in new than in old. Wildcards in
old and new correspond one-to-one from left-to-right unless you specify explicit subscripts
in new. Anyway, rename ran out of wildcards in old when matching the wildcards in new.
Perhaps you just made a mistake or perhaps you forgot an explicit subscript in new. Or
perhaps you forgot to specify option addnumber, which allows you to specify an extra #, (#),
(##), ...

If I do the reshape option then it says that j(id) contains all msising variables

and also if i do:
ren (v*) (`wordlist')
oldnames and newnames do not match
You specified 1 pattern or name for oldname and 2 patterns or names for newnames.

Last edited by Eleanor Keeble; 28 Dec 2022, 11:43.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#12

28 Dec 2022, 12:07

Show us a sample of your dataset. The codes suggest that you have a variable named _cpc21_2 and variables named v1, v2, v3, and so on. So, you may copy and paste the result of

Code:

dataex _cpc21_2 v1-v5 in 1/5
1 like
Comment

Announcement

Question about xpose and sxpose?

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment