Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problem reshaping an entire dataset

    Hi there. I'm new here and I want to discuss a problem with my dataset. I'm using Stata 16.1.

    I have the following dataset, which I want to reshape from wide to long to use it as panel data. The problem I have right now is that I don't know how to completely reshape it, given that the dataset has more than 100 variables for each year. All variables included here end up with the tag "_w01-_w02-_w03-_w04" as I show in the following example:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double idencuesta int(c01_w01 c01_w02 c01_w03 c01_w04 t01_w01 t01_w02 t01_w03 t01_w04)
    1101011    1 2    1    1    1 1    5 4
    1101012    1 1    3    2    3 4    4 4
    1101013    1 1    5    2    3 1    4 3
    1101021    1 3    1    2    3 3 -888 3
    1101022    2 2    .    .    2 4    . .
    1101023    1 4    1    1    2 3    1 1
    1101032    3 1    1    2    3 4    3 4
    1101033    3 3    2    1    3 3    3 3
    1101041    1 1    1    1    2 2    2 2
    I'm not sure how to proceed with reshaping all variables at once, can you help me? Thanks in advance.

  • #2
    This should start you in a useful direction.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double idencuesta int(c01_w01 c01_w02 c01_w03 c01_w04 t01_w01 t01_w02 t01_w03 t01_w04)
    1101011    1 2    1    1    1 1    5 4
    1101012    1 1    3    2    3 4    4 4
    1101013    1 1    5    2    3 1    4 3
    1101021    1 3    1    2    3 3 -888 3
    1101022    2 2    .    .    2 4    . .
    1101023    1 4    1    1    2 3    1 1
    1101032    3 1    1    2    3 4    3 4
    1101033    3 3    2    1    3 3    3 3
    1101041    1 1    1    1    2 2    2 2
    end
    // build list of stubs
    local stubs
    foreach v of varlist *_w* {
        local s = ustrregexra("`v'", "_w.*$", "")
        local stubs : list stubs | s
    }
    macro list _stubs
    reshape long `stubs', i(idencuesta) j(w) string
    destring w, replace ignore("w_")
    list in 1/8, sepby(idencuesta)
    Code:
    . macro list _stubs
    _stubs:         c01 t01
    
    . reshape long `stubs', i(idencuesta) j(w) string
    (j = _w01 _w02 _w03 _w04)
    
    Data                               Wide   ->   Long
    -----------------------------------------------------------------------------
    Number of observations                9   ->   36          
    Number of variables                   9   ->   4           
    j variable (4 values)                     ->   w
    xij variables:
                c01_w01 c01_w02 ... c01_w04   ->   c01
                t01_w01 t01_w02 ... t01_w04   ->   t01
    -----------------------------------------------------------------------------
    
    . destring w, replace ignore("w_")
    w: characters w _ removed; replaced as byte
    
    . list in 1/8, sepby(idencuesta)
    
         +--------------------------+
         | idencu~a   w   c01   t01 |
         |--------------------------|
      1. |  1101011   1     1     1 |
      2. |  1101011   2     2     1 |
      3. |  1101011   3     1     5 |
      4. |  1101011   4     1     4 |
         |--------------------------|
      5. |  1101012   1     1     3 |
      6. |  1101012   2     1     4 |
      7. |  1101012   3     3     4 |
      8. |  1101012   4     2     4 |
         +--------------------------+
    
    .

    Comment


    • #3
      Like William Lisowski , I guessed that your primary problem (?) was a convenient way to list all the stubs connected to your w01-w04 suffixes. An alternative way to get that list, not shorter or better but perhaps nevertheless of interest, would be:
      Code:
      unab stubs: *_w0?
      local stubs =  ustrregexra("`stubs'", "w0.", "") // strip suffixes
      local stubs: list uniq stubs  // remove duplicates
      -help macrolists- describes some nice list features in Stata, which I always have to look up.

      Comment


      • #4
        Originally posted by Mike Lacy View Post
        Like William Lisowski , I guessed that your primary problem (?) was a convenient way to list all the stubs connected to your w01-w04 suffixes. An alternative way to get that list, not shorter or better but perhaps nevertheless of interest, would be:
        Code:
        unab stubs: *_w0?
        local stubs = ustrregexra("`stubs'", "w0.", "") // strip suffixes
        local stubs: list uniq stubs // remove duplicates
        -help macrolists- describes some nice list features in Stata, which I always have to look up.
        Yes, I wanted, as you said, a convinient way to list all the stubs. Thank you!

        Comment


        • #5
          Originally posted by William Lisowski View Post
          This should start you in a useful direction.
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input double idencuesta int(c01_w01 c01_w02 c01_w03 c01_w04 t01_w01 t01_w02 t01_w03 t01_w04)
          1101011 1 2 1 1 1 1 5 4
          1101012 1 1 3 2 3 4 4 4
          1101013 1 1 5 2 3 1 4 3
          1101021 1 3 1 2 3 3 -888 3
          1101022 2 2 . . 2 4 . .
          1101023 1 4 1 1 2 3 1 1
          1101032 3 1 1 2 3 4 3 4
          1101033 3 3 2 1 3 3 3 3
          1101041 1 1 1 1 2 2 2 2
          end
          // build list of stubs
          local stubs
          foreach v of varlist *_w* {
          local s = ustrregexra("`v'", "_w.*$", "")
          local stubs : list stubs | s
          }
          macro list _stubs
          reshape long `stubs', i(idencuesta) j(w) string
          destring w, replace ignore("w_")
          list in 1/8, sepby(idencuesta)
          Code:
          . macro list _stubs
          _stubs: c01 t01
          
          . reshape long `stubs', i(idencuesta) j(w) string
          (j = _w01 _w02 _w03 _w04)
          
          Data Wide -> Long
          -----------------------------------------------------------------------------
          Number of observations 9 -> 36
          Number of variables 9 -> 4
          j variable (4 values) -> w
          xij variables:
          c01_w01 c01_w02 ... c01_w04 -> c01
          t01_w01 t01_w02 ... t01_w04 -> t01
          -----------------------------------------------------------------------------
          
          . destring w, replace ignore("w_")
          w: characters w _ removed; replaced as byte
          
          . list in 1/8, sepby(idencuesta)
          
          +--------------------------+
          | idencu~a w c01 t01 |
          |--------------------------|
          1. | 1101011 1 1 1 |
          2. | 1101011 2 2 1 |
          3. | 1101011 3 1 5 |
          4. | 1101011 4 1 4 |
          |--------------------------|
          5. | 1101012 1 1 3 |
          6. | 1101012 2 1 4 |
          7. | 1101012 3 3 4 |
          8. | 1101012 4 2 4 |
          +--------------------------+
          
          .
          Thank you very much, it worked! I appreciate your help

          Comment

          Working...
          X