Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Transform unbalanced to balanced

    This is an example of my data:
    Code:
    firm date      nhired
    2     198702 1
    2     198705 41
    2     198706 7
    10   198702 2
    10   198703 5
    10   198704 3
    10   198705 8
    195  .           .
    394  .           .
    Basically I have an unbalanced panel dataset and I want to transform it into a balanced panel dataset. This is what I want:
    Code:
    firm  date       nhired
    2      198702  1
    2      198703  0
    2      198704  0
    2      198705  41
    2      198706  7
    10    198702  2
    10    198703  5
    10    198704  3
    10    198705  8
    10    198706  0
    195  198702  0
    195  198703  0
    195  198704  0
    195  198705  0
    195  198706  0
    394  198702  0
    394  198703  0
    394  198704  0
    394  198705  0
    394  198706  0
    I tried to use "spbalance, balance" but it just dropped all observations. Does anyone have a suggestion about how to get there? I have more than 100 periods in my dataset.

  • #2
    80 posts in, you should be able to present a proper data example. See FAQ Advice #12 if you are still struggling with that.

    Code:
    clear
    input float(firm date nhired)
    2 198702 1
    2 198705 41
    2 198706 7
    10 198702 2
    10 198703 5
    10 198704 3
    10 198705 8
    195 . .
    394 . .
    end
    
    sort date
    replace date= date[1] if missing(date)
    xtset firm date
    tsfill, full
    Res.:

    Code:
    . l
    
         +------------------------+
         | firm     date   nhired |
         |------------------------|
      1. |    2   198702        1 |
      2. |    2   198703        . |
      3. |    2   198704        . |
      4. |    2   198705       41 |
      5. |    2   198706        7 |
         |------------------------|
      6. |   10   198702        2 |
      7. |   10   198703        5 |
      8. |   10   198704        3 |
      9. |   10   198705        8 |
     10. |   10   198706        . |
         |------------------------|
     11. |  195   198702        . |
     12. |  195   198703        . |
     13. |  195   198704        . |
     14. |  195   198705        . |
     15. |  195   198706        . |
         |------------------------|
     16. |  394   198702        . |
     17. |  394   198703        . |
     18. |  394   198704        . |
     19. |  394   198705        . |
     20. |  394   198706        . |
         +------------------------+
    
    .

    Comment


    • #3
      The implication of what you're asking for is that you know that absence from the dataset is to be treated as a zero and also that whatever commands you intend to use next will treat those zeros appropriately.

      Given that, Andrew Musau's very helpful code should be followed by

      Code:
      replace nhired = 0 if nhired == .

      Comment


      • #4
        Sorry for the late reply.
        Thank you Andrew Musau for your suggestion. The challenge in my dataset is that I have more than 100 periods, and when I write:
        Code:
        tsfill, full
        Stata gives me the information
        Code:
        sum of expand values exceed 2,147,483,620 observations
        The dataset may not contain more than 2,147,483,620 observations
        r(459)
        Is there a way to expand only with the periods in my data?

        Code:
        xtset firm date
        Code:
        Panel variable: firm (unbalanced)
        Time variable: date, 198504 to 202005, but with gaps
        Delta: 1 unit
        (in my time variable the four digits are the year, and the two last digits the month, and it is a numerical variable)

        Comment


        • #5
          Create a proper year-month date variable and xtset with this.

          Code:
          help datetime
          Code:
          clear
          input float(firm date nhired)
          2 198702 1
          2 198705 41
          2 198706 7
          10 198702 2
          10 198703 5
          10 198704 3
          10 198705 8
          195 . .
          394 . .
          end
          
          gen yearmonth= ym(int(date/100), date-(int(date/100)*100)), after(firm)
          format yearmonth %tm
          sort yearmonth
          replace yearmonth= yearmonth[1] if missing(yearmonth)
          xtset firm yearmonth 
          tsfill, full
          replace nhired = 0 if nhired == .
          Res.:

          Code:
          . l
          
               +-----------------------------------+
               | firm   yearmo~h     date   nhired |
               |-----------------------------------|
            1. |    2     1987m2   198702        1 |
            2. |    2     1987m3        .        0 |
            3. |    2     1987m4        .        0 |
            4. |    2     1987m5   198705       41 |
            5. |    2     1987m6   198706        7 |
               |-----------------------------------|
            6. |   10     1987m2   198702        2 |
            7. |   10     1987m3   198703        5 |
            8. |   10     1987m4   198704        3 |
            9. |   10     1987m5   198705        8 |
           10. |   10     1987m6        .        0 |
               |-----------------------------------|
           11. |  195     1987m2        .        0 |
           12. |  195     1987m3        .        0 |
           13. |  195     1987m4        .        0 |
           14. |  195     1987m5        .        0 |
           15. |  195     1987m6        .        0 |
               |-----------------------------------|
           16. |  394     1987m2        .        0 |
           17. |  394     1987m3        .        0 |
           18. |  394     1987m4        .        0 |
           19. |  394     1987m5        .        0 |
           20. |  394     1987m6        .        0 |
               +-----------------------------------+
          
          .

          Comment

          Working...
          X