Problem reshaping an entire dataset

Albert Eriksson

Join Date: Jan 2022

Posts: 3
#1

Problem reshaping an entire dataset

03 Jan 2022, 08:42

Hi there. I'm new here and I want to discuss a problem with my dataset. I'm using Stata 16.1.

I have the following dataset, which I want to reshape from wide to long to use it as panel data. The problem I have right now is that I don't know how to completely reshape it, given that the dataset has more than 100 variables for each year. All variables included here end up with the tag "_w01-_w02-_w03-_w04" as I show in the following example:

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input double idencuesta int(c01_w01 c01_w02 c01_w03 c01_w04 t01_w01 t01_w02 t01_w03 t01_w04) 1101011 1 2 1 1 1 1 5 4 1101012 1 1 3 2 3 4 4 4 1101013 1 1 5 2 3 1 4 3 1101021 1 3 1 2 3 3 -888 3 1101022 2 2 . . 2 4 . . 1101023 1 4 1 1 2 3 1 1 1101032 3 1 1 2 3 4 3 4 1101033 3 3 2 1 3 3 3 3 1101041 1 1 1 1 2 2 2 2

I'm not sure how to proceed with reshaping all variables at once, can you help me? Thanks in advance.
Tags: None

William Lisowski

Join Date: Dec 2014
Posts: 10150

03 Jan 2022, 09:12

This should start you in a useful direction.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double idencuesta int(c01_w01 c01_w02 c01_w03 c01_w04 t01_w01 t01_w02 t01_w03 t01_w04)
1101011    1 2    1    1    1 1    5 4
1101012    1 1    3    2    3 4    4 4
1101013    1 1    5    2    3 1    4 3
1101021    1 3    1    2    3 3 -888 3
1101022    2 2    .    .    2 4    . .
1101023    1 4    1    1    2 3    1 1
1101032    3 1    1    2    3 4    3 4
1101033    3 3    2    1    3 3    3 3
1101041    1 1    1    1    2 2    2 2
end
// build list of stubs
local stubs
foreach v of varlist *_w* {
    local s = ustrregexra("`v'", "_w.*$", "")
    local stubs : list stubs | s
}
macro list _stubs
reshape long `stubs', i(idencuesta) j(w) string
destring w, replace ignore("w_")
list in 1/8, sepby(idencuesta)

Code:

. macro list _stubs
_stubs:         c01 t01

. reshape long `stubs', i(idencuesta) j(w) string
(j = _w01 _w02 _w03 _w04)

Data                               Wide   ->   Long
-----------------------------------------------------------------------------
Number of observations                9   ->   36          
Number of variables                   9   ->   4           
j variable (4 values)                     ->   w
xij variables:
            c01_w01 c01_w02 ... c01_w04   ->   c01
            t01_w01 t01_w02 ... t01_w04   ->   t01
-----------------------------------------------------------------------------

. destring w, replace ignore("w_")
w: characters w _ removed; replaced as byte

. list in 1/8, sepby(idencuesta)

     +--------------------------+
     | idencu~a   w   c01   t01 |
     |--------------------------|
  1. |  1101011   1     1     1 |
  2. |  1101011   2     2     1 |
  3. |  1101011   3     1     5 |
  4. |  1101011   4     1     4 |
     |--------------------------|
  5. |  1101012   1     1     3 |
  6. |  1101012   2     1     4 |
  7. |  1101012   3     3     4 |
  8. |  1101012   4     2     4 |
     +--------------------------+

.

Comment

Mike Lacy

Join Date: Apr 2014

Posts: 2416
#3

03 Jan 2022, 09:42

Like William Lisowski , I guessed that your primary problem (?) was a convenient way to list all the stubs connected to your w01-w04 suffixes. An alternative way to get that list, not shorter or better but perhaps nevertheless of interest, would be:

Code:

unab stubs: *_w0? local stubs = ustrregexra("`stubs'", "w0.", "") // strip suffixes local stubs: list uniq stubs // remove duplicates

-help macrolists- describes some nice list features in Stata, which I always have to look up.
1 like
Comment
Albert Eriksson

Join Date: Jan 2022

Posts: 3
#4

03 Jan 2022, 10:09

Originally posted by Mike Lacy View Post

Like William Lisowski , I guessed that your primary problem (?) was a convenient way to list all the stubs connected to your w01-w04 suffixes. An alternative way to get that list, not shorter or better but perhaps nevertheless of interest, would be:

Code:

unab stubs: *_w0? local stubs = ustrregexra("`stubs'", "w0.", "") // strip suffixes local stubs: list uniq stubs // remove duplicates

-help macrolists- describes some nice list features in Stata, which I always have to look up.

Yes, I wanted, as you said, a convinient way to list all the stubs. Thank you!
Comment

Albert Eriksson

Join Date: Jan 2022
Posts: 3

03 Jan 2022, 10:10

Originally posted by William Lisowski View Post

This should start you in a useful direction.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double idencuesta int(c01_w01 c01_w02 c01_w03 c01_w04 t01_w01 t01_w02 t01_w03 t01_w04)
1101011 1 2 1 1 1 1 5 4
1101012 1 1 3 2 3 4 4 4
1101013 1 1 5 2 3 1 4 3
1101021 1 3 1 2 3 3 -888 3
1101022 2 2 . . 2 4 . .
1101023 1 4 1 1 2 3 1 1
1101032 3 1 1 2 3 4 3 4
1101033 3 3 2 1 3 3 3 3
1101041 1 1 1 1 2 2 2 2
end
// build list of stubs
local stubs
foreach v of varlist *_w* {
local s = ustrregexra("`v'", "_w.*$", "")
local stubs : list stubs | s
}
macro list _stubs
reshape long `stubs', i(idencuesta) j(w) string
destring w, replace ignore("w_")
list in 1/8, sepby(idencuesta)

Code:

. macro list _stubs
_stubs: c01 t01

. reshape long `stubs', i(idencuesta) j(w) string
(j = _w01 _w02 _w03 _w04)

Data Wide -> Long
-----------------------------------------------------------------------------
Number of observations 9 -> 36
Number of variables 9 -> 4
j variable (4 values) -> w
xij variables:
c01_w01 c01_w02 ... c01_w04 -> c01
t01_w01 t01_w02 ... t01_w04 -> t01
-----------------------------------------------------------------------------

. destring w, replace ignore("w_")
w: characters w _ removed; replaced as byte

. list in 1/8, sepby(idencuesta)

+--------------------------+
| idencu~a w c01 t01 |
|--------------------------|
1. | 1101011 1 1 1 |
2. | 1101011 2 2 1 |
3. | 1101011 3 1 5 |
4. | 1101011 4 1 4 |
|--------------------------|
5. | 1101012 1 1 3 |
6. | 1101012 2 1 4 |
7. | 1101012 3 3 4 |
8. | 1101012 4 2 4 |
+--------------------------+

.

Thank you very much, it worked! I appreciate your help

Announcement

Problem reshaping an entire dataset

Comment

Comment

Comment

Comment