Reshape specific variables

Nicolas Charette

Join Date: Aug 2022
Posts: 23

12 Aug 2022, 08:01

Hello,

I am using Stata 17.0/BE for Mac and I’m trying to reshape my data for a panel. I want the following:

the names of the _course* variables to become the values of a new variable (newvar1);
the values of the _course* variables to go into a new variable (newvar2);
the values of year and stuff* to be carried along with each new observation.

This is what I have:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input int(id year) double stuff int stufff2 byte(_course1 _course2 _course3 _course4 _course5)
333 2010 6  45 83 90  .  .  .
333 2011 6  16  .  .  .  .  .
333 2011 7 117  .  . 84 66 73
333 2012 8 117  .  . 83  . 83
end

And this is what I want it to look like after the reshape:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input int id str8 newvar1 int year double stuff int stufff2 byte _newvar2
333 "_course1" 2010 6  45 83
333 "_course2" 2010 6  45 90
333 "."        2011 6  16  .
333 "_course3" 2011 7 117 84
333 "_course4" 2011 7 117 66
333 "_course5" 2011 7 117 73
333 "_course3" 2012 8 117 83
333 "_course5" 2012 8 117 83
end

Thank you for your help,

Nicolas Charette

Tags: panel, panel data, reshape

Hemanshu Kumar

Join Date: Mar 2015

Posts: 1400
#2

12 Aug 2022, 08:27

Consider this:

Code:

reshape long _course , i(id year stuff*) j(course_num)
gen newvar1 = "_course" + string(course_num)
drop course_num
rename _course newvar2
drop if missing(newvar2)

It doesn't have the third observation in your required dataset; is that a problem?
Tim Cheney

Join Date: Sep 2021

Posts: 12
#3

12 Aug 2022, 08:35

It may be easier to use the expand command and then apply some logic to recode the result into what you want.
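One possible reading of this suggestion, as a rough sketch only. The helper variables seq, n_courses, k, and found are names made up for illustration, and this is not necessarily what the poster had in mind. It duplicates each observation once per non-missing course (keeping at least one row) and then assigns the k-th non-missing course to the k-th copy.

```stata
* Example data from post #1
clear
input int(id year) double stuff int stufff2 byte(_course1 _course2 _course3 _course4 _course5)
333 2010 6  45 83 90  .  .  .
333 2011 6  16  .  .  .  .  .
333 2011 7 117  .  . 84 66 73
333 2012 8 117  .  . 83  . 83
end

* seq, n_courses, k and found are hypothetical helper variables
generate long seq = _n
egen byte n_courses = rownonmiss(_course*)

* one copy per non-missing course, but at least one row per original observation
expand max(n_courses, 1)
bysort seq: generate byte k = _n

generate str8 newvar1 = ""
generate newvar2 = .
generate byte found = 0

* give the k-th copy of each observation its k-th non-missing course
forvalues j = 1/5 {
    replace found = found + !missing(_course`j')
    replace newvar1 = "_course`j'" if found == k & !missing(_course`j')
    replace newvar2 = _course`j' if found == k & !missing(_course`j')
}

drop _course* seq n_courses k found
order id newvar1 year stuff stufff2 newvar2
```

The loop bound 5 is hard-coded to match the five _course variables in the example; with a different number of courses it would need to change.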
William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

12 Aug 2022, 09:23

This builds on the code in post #2 and creates the single observation when all values of _course are missing.

Code:

generate seq = _n
reshape long _course , i(seq) j(newvar1) string
replace newvar1 = "_course" + newvar1
rename _course newvar2
sort id year seq
// keep one obs when all values of newvar2 are missing
by id year seq: egen c = count(newvar2)
by id year seq: replace newvar1 = "" if c==0 & _n==1
drop c seq
drop if newvar2==. & newvar1!=""

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str8 newvar1 int(id year) double stuff int stufff2 byte newvar2
"_course1" 333 2010 6  45 83
"_course2" 333 2010 6  45 90
""         333 2011 6  16  .
"_course3" 333 2011 7 117 84
"_course4" 333 2011 7 117 66
"_course5" 333 2011 7 117 73
"_course3" 333 2012 8 117 83
"_course5" 333 2012 8 117 83
end

A few comments.

Since I needed a sequence number to identify the original observations in the long dataset, I used it, rather than id, year, stuff, and stufff2, as the i() identifier in the reshape. Also, nothing guarantees that two observations cannot share the same values of id, year, stuff, and stufff2, in which case reshape long with i(id year stuff*) would fail.
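For anyone who would rather keep i(id year stuff*), here is a small sketch of how one might check beforehand whether those variables uniquely identify the observations. It uses the example data from post #1; the local macro not_unique is a name made up for illustration.

```stata
* Example data from post #1
clear
input int(id year) double stuff int stufff2 byte(_course1 _course2 _course3 _course4 _course5)
333 2010 6  45 83 90  .  .  .
333 2011 6  16  .  .  .  .  .
333 2011 7 117  .  . 84 66 73
333 2012 8 117  .  . 83  . 83
end

* tabulate how many observations share the same id/year/stuff* values
duplicates report id year stuff*

* isid exits with an error if the variables do not uniquely identify observations
capture isid id year stuff*
local not_unique = _rc
if `not_unique' {
    display as text "id year stuff* are not unique; use a sequence number in i() instead"
}
else {
    display as text "id year stuff* uniquely identify the observations"
}
```

In the example data the check passes, but that says nothing about the full dataset, which is why the sequence number is the safer choice.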

I preserve the suffix 1-5 as a string so I can just use string concatenation to turn it back into _course1 through _course5.
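To illustrate just that point in isolation, here is a tiny sketch on a toy dataset. The two observations and the variable name suffix are made up for the demonstration and are not from the original data.

```stata
* Toy data, invented purely to show the string option of reshape
clear
input byte(seq _course1 _course2)
1 83 90
2  . 84
end

* with the string option, j() is created as a string variable holding "1" and "2"
reshape long _course , i(seq) j(suffix) string

* so the original variable name can be rebuilt by plain concatenation,
* with no need for string() as in the numeric-j version in post #2
generate newvar1 = "_course" + suffix
list seq suffix newvar1 _course
```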

To create the single observation with missing values when all values of _course are missing, I count the number of non-missing values and if the count is zero, replace newvar1 with "" (which is Stata's string missing value), and then don't drop those observations.

Added in edit: the code below produces the same result, perhaps more simply.

Code:

generate seq = _n
egen c = rownonmiss(_course*)
reshape long _course , i(seq) j(newvar1) string
replace newvar1 = "_course" + newvar1
rename _course newvar2
// keep one obs when all values of newvar2 are missing
replace newvar1 = "" if c==0 & newvar1=="_course1"
drop c seq
drop if newvar2==. & newvar1!=""

Last edited by William Lisowski; 12 Aug 2022, 09:27.
Nicolas Charette

Join Date: Aug 2022

Posts: 23
#5

14 Aug 2022, 06:57

Thank you very much, Hemanshu! It worked. Thanks for the added code as well, William!