  • Question about data management with panels

    Hi all,

    I sometimes work with panel data that have several cross-section units (i = 1, ..., I), time periods (y = 1, ..., Y), and variables (v1, ..., vN).
    When processing the data, I often need to aggregate it (i.e., run calculations that group the cross-section units for a specific time period). I store the results in scalars inside loops. If I used variables instead of scalars, the calculated values would be repeated along the time-series dimension rather than stored as a single value. However, scalars last only for the current session, so to retrieve the values after closing Stata I have to rerun the commands.

    Here is my question: what kind of storage do you recommend when you move from one dimension of the panel to another and need to keep the results for later?

    Thank you very much.

  • #2
    You should store the results in variables rather than in scalars, and then use if qualifiers to restrict your subsequent calculations to the appropriate subset of the data.
    Code:
    . * Example generated by -dataex-. To install: ssc install dataex
    . clear
    
    . input float(id wave x)
    
                id       wave          x
      1. 1 1  6
      2. 1 2  5
      3. 2 1  3
      4. 2 2  1
      5. 3 1  6
      6. 3 2  2
      7. 4 1  6
      8. 4 2  9
      9. 5 1  4
     10. 5 2 10
     11. end
    
    . 
    . egen x_id = sum(x), by(id)
    
    . egen x_wave = sum(x), by(wave)
    
    . 
    . sort id
    
    . list id wave x x_id, sepby(id)
    
         +-----------------------+
         | id   wave    x   x_id |
         |-----------------------|
      1. |  1      1    6     11 |
      2. |  1      2    5     11 |
         |-----------------------|
      3. |  2      1    3      4 |
      4. |  2      2    1      4 |
         |-----------------------|
      5. |  3      1    6      8 |
      6. |  3      2    2      8 |
         |-----------------------|
      7. |  4      1    6     15 |
      8. |  4      2    9     15 |
         |-----------------------|
      9. |  5      1    4     14 |
     10. |  5      2   10     14 |
         +-----------------------+
    
    . sort wave
    
    . list id wave x x_wave, sepby(wave)
    
         +-------------------------+
         | id   wave    x   x_wave |
         |-------------------------|
      1. |  3      1    6       25 |
      2. |  1      1    6       25 |
      3. |  4      1    6       25 |
      4. |  2      1    3       25 |
      5. |  5      1    4       25 |
         |-------------------------|
      6. |  3      2    2       27 |
      7. |  2      2    1       27 |
      8. |  4      2    9       27 |
      9. |  1      2    5       27 |
     10. |  5      2   10       27 |
         +-------------------------+
    
    . 
    . tab x_id if wave==1
    
           x_id |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              4 |          1       20.00       20.00
              8 |          1       20.00       40.00
             11 |          1       20.00       60.00
             14 |          1       20.00       80.00
             15 |          1       20.00      100.00
    ------------+-----------------------------------
          Total |          5      100.00
    
    . tab x_wave if id==1
    
         x_wave |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             25 |          1       50.00       50.00
             27 |          1       50.00      100.00
    ------------+-----------------------------------
          Total |          2      100.00
    
    .
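
    The repeated values raised in the question can be handled by flagging one observation per group, and saving the dataset keeps the stored totals for a later session. The following is a minimal sketch under stated assumptions, not part of the log above: the variable names one_per_wave and total_x and the file names panel_totals and wave_totals are hypothetical, and total() is used as the documented name of the egen function that the log calls sum().

    ```stata
    * Rebuild the example data from the log above
    clear
    input float(id wave x)
    1 1  6
    1 2  5
    2 1  3
    2 2  1
    3 1  6
    3 2  2
    4 1  6
    4 2  9
    5 1  4
    5 2 10
    end

    * Group totals stored as variables, as in the log above
    egen x_id   = total(x), by(id)
    egen x_wave = total(x), by(wave)

    * Flag exactly one observation per wave (hypothetical variable name),
    * so each wave total can be used once despite being repeated
    egen byte one_per_wave = tag(wave)
    list wave x_wave if one_per_wave

    * Saving the dataset keeps the stored totals after the session ends
    * (panel_totals is a hypothetical file name)
    save panel_totals, replace

    * Optional: write a one-row-per-wave file while keeping the panel in memory
    * (total_x and wave_totals are hypothetical names)
    preserve
    collapse (sum) total_x = x, by(wave)
    save wave_totals, replace
    restore
    ```

    In a later session, loading the saved file with use panel_totals, clear should bring the totals back without rerunning the loops. Whether the flagged-observation approach or the separate collapsed file is more convenient depends on how the totals are used afterwards.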



    • #3
      Thank you William!
