Reshaping based on suffix not prefix

Eva Warren

Join Date: Feb 2022
Posts: 6

Reshaping based on suffix not prefix

02 Mar 2022, 11:49

Hey y'all! Apologies in advance if this has been asked, I ran a search and didn't find anything but please feel free to point me in the right direction if needed.

tl;dr: I have data in wide format that I need to reshape into long. The variable names vary by prefix and not by suffix. AFAIK, reshape appears to require static prefixes for the stub and variable suffixes for j. Is there a way to run reshape without having to relabel all my variables?

Longer explanation:

I have data from a randomized vignette survey. Participants answered the same questions about 9 out of 54 possible vignettes. The dataset labels the variables by vignette and then by content, so my variables look like "A1impact A1intent B1impact B1intent A2impact A2intent B2impact B2intent." I would like to reshape from wide to long where "intent" and "impact" are my new variables with rows for A1, A2, B1 and B2. Is this possible? I know my i is PID (participant ID) and my instinct is to make j something like *impact. The closest option I found to that would be @impact but I've run into errors with all of those approaches. Any advice?

Details:
I get the following error when I try @@@outcome or *outcome or ???outcome

Code:

. reshape long @@@outcome @@@convince @@@intent @@@impact @@@difftreat @@@power @@@realistic @@@wrong, i(PID) j(vignette)
no xij variables found
    You typed something like reshape wide a b, i(i) j(j).
    reshape looked for existing variables named a# and b# but could not find any.  Remember this picture:

         long                                wide
        +---------------+                   +------------------+
        | i   j   a   b |                   | i   a1 a2  b1 b2 |
        |---------------| <--- reshape ---> |------------------|
        | 1   1   1   2 |                   | 1   1   3   2  4 |
        | 1   2   3   4 |                   | 2   5   7   6  8 |
        | 2   1   5   6 |                   +------------------+
        | 2   2   7   8 |
        +---------------+

        long to wide: reshape wide a b, i(i) j(j)    (j existing variable)
        wide to long: reshape long a b, i(i) j(j)    (j    new   variable)

I get the following error when I try @outcome

Code:

. reshape long @outcome @convince @intent @impact @difftreat @power @realistic @wrong, i(PID) j(vignette)
variable vignette contains all missing values
r(498);

When I try to drop vignette it tells me the variable cannot be found.

Existing Data Structure

PID	S1Aimpact	S1Aintent	S1Bimpact	S1Bintent	S2Aimpact	S2Aintent	S2Bimpact	S2Bintent	R1Aimpact	R1Aintent
1	90	50			100	100
2			75	100	20	20			90	100
3	50	75					20	0	100	50
4	20	100					90	0

Ideal Data Structure

PID	impact	intent	vignette
1	90	50	S1A
1	100	100	S2A
2	75	100	S1B
2	20	20	S2A
2	90	100	R1A
3	50	75	S1A
3	20	0	S2B
3	100	50	R1A
4	20	100	S1A
4	90	0	S2B

Other helpful info: I'm running v16.1 on a Windows 10 machine.

ETA: Unfortunately the data is confidential and cannot be shared. Please let me know if I can make the example data more helpful! All of the data is numerical with associated labels as appropriate.

Last edited by Eva Warren; 02 Mar 2022, 12:08.

Tags: reshape

Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

02 Mar 2022, 12:07

Code:

* Example generated by -dataex-. For more info, type help dataex clear input byte(pid s1aimpact s1aintent s1bimpact s1bintent s2aimpact s2aintent s2bimpact s2bintent r1aimpact r1aintent) 1 90 50 . . 100 100 . . . . 2 . . 75 100 20 20 . . 90 100 3 50 75 . . . . 20 0 100 50 4 20 100 . . . . 90 0 . . end reshape long @impact @intent, i(pid) j(vignette) string drop if missing(impact) & missing(intent)

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
3 likes
Comment

William Lisowski

Join Date: Dec 2014
Posts: 10150

02 Mar 2022, 12:12

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(pid s1aimpact s1aintent s1bimpact s1bintent s2aimpact s2aintent s2bimpact s2bintent r1aimpact r1aintent)
1 90  50  .   . 100 100  . .   .   .
2  .   . 75 100  20  20  . .  90 100
3 50  75  .   .   .   . 20 0 100  50
4 20 100  .   .   .   . 90 0   .   .
end
reshape long @impact @intent, i(pid) j(vignette) string
drop if impact==. & intent==.
list, clean

Code:

. reshape long @impact @intent, i(pid) j(vignette) string
(j = r1a s1a s1b s2a s2b)

Data                               Wide   ->   Long
-----------------------------------------------------------------------------
Number of observations                4   ->   20          
Number of variables                  11   ->   4          
j variable (5 values)                     ->   vignette
xij variables:
      r1aimpact s1aimpact ... s2bimpact   ->   impact
      r1aintent s1aintent ... s2bintent   ->   intent
-----------------------------------------------------------------------------

. drop if impact==. & intent==.
(10 observations deleted)

. list, clean

       pid   vignette   impact   intent  
  1.     1        s1a       90       50  
  2.     1        s2a      100      100  
  3.     2        r1a       90      100  
  4.     2        s1b       75      100  
  5.     2        s2a       20       20  
  6.     3        r1a      100       50  
  7.     3        s1a       50       75  
  8.     3        s2b       20        0  
  9.     4        s1a       20      100  
 10.     4        s2b       90        0  

.

Added in edit: Clyde and I have simultaneously posted similar answers in the past, but never before have we posted identical code.

Last edited by William Lisowski; 02 Mar 2022, 12:17.

Comment

Eva Warren

Join Date: Feb 2022

Posts: 6
#4

02 Mar 2022, 12:12

AMAZING! Thank you so much! Quick question about -dataex-: I went the table route because our data is confidential and can't be shared. In future similar cases, should I just create a mock .dta sample and then use -dataex-?
Comment
William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

02 Mar 2022, 12:14

Exactly. This is explained in the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.
1 like
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#6

02 Mar 2022, 12:15

In future similar cases, should I just create a mock .dta sample and then use -dataex-?

YES!!! Just be sure that the mock example exhibits the same problems/questions as the original data.
1 like
Comment
Eva Warren

Join Date: Feb 2022

Posts: 6
#7

02 Mar 2022, 12:19

Will do! Thank you both for your help. You've save me many hours of frustration.
Comment

Announcement

Reshaping based on suffix not prefix

Comment

Comment

Comment

Comment

Comment

Comment