Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • From time-events to a monthly count-table in STATA

    Hello everyone, I have a question I'm struggling to answer. I have a dataset with a number of workers. For each of them, I have their start date (beginning_date) and end date (stopwork_date) of service for different departments.
    I would like to obtain a table where I can see, for each month, the number of workers present.

    I have more than 3000 workers. The first 4 records are just like the following:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id department) str10(beginning_date stopwork_date)
    1 1 "23/01/2023" "24/07/2023"
    2 1 "14/04/2023" "21/06/2023"
    3 2 "12/02/2023" "12/12/2023"
    4 2 "17/03/2023" "15/09/2023"
    end
    I'm trying to have an output just like this, where I have the number of persons at work for each month:
    department January2023 February2023 March2023
    1 345 367 355
    2 234 222 220
    3 123 134 130
    4 12 14 14
    5 245 248 247

    Does anyone have a solution?


  • #2
    Here is one solution. I draw short of a final reshape wide, which I don't advise.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id department) str10(beginning_date stopwork_date)
    1 1 "23/01/2023" "24/07/2023"
    2 1 "14/04/2023" "21/06/2023"
    3 2 "12/02/2023" "12/12/2023"
    4 2 "17/03/2023" "15/09/2023"
    end
    
    rename (beginning_date stopwork_date) (date1 date2)
    
    reshape long date, i(id) j(event)
    
    gen mdate = mofd(daily(date, "DMY"))
    
    format mdate %tm 
    
    drop department date 
    
    duplicates drop id mdate, force 
    
    bysort id (mdate) : gen toexpand = cond(_n == 1, 1, mdate[2] - mdate[1])  
    
    expand toexpand 
    
    bysort id (mdate) : replace mdate = mdate[1] + _n - 1 
    
    list, sepby(id)
    
    bysort mdate : gen wanted = _N 
    
    list, sepby(mdate)

    Code:
    . * Example generated by -dataex-. For more info, type help dataex
    . clear
    
    . input byte(id department) str10(beginning_date stopwork_date)
    
               id  depart~t  beginnin~e  stopwork~e
      1. 1 1 "23/01/2023" "24/07/2023"
      2. 2 1 "14/04/2023" "21/06/2023"
      3. 3 2 "12/02/2023" "12/12/2023"
      4. 4 2 "17/03/2023" "15/09/2023"
      5. end
    
    . 
    . rename (beginning_date stopwork_date) (date1 date2)
    
    . 
    . reshape long date, i(id) j(event)
    (j = 1 2)
    
    Data                               Wide   ->   Long
    -----------------------------------------------------------------------------
    Number of observations                4   ->   8           
    Number of variables                   4   ->   4           
    j variable (2 values)                     ->   event
    xij variables:
                                date1 date2   ->   date
    -----------------------------------------------------------------------------
    
    . 
    . gen mdate = mofd(daily(date, "DMY"))
    
    . 
    . format mdate %tm 
    
    . 
    . drop department date 
    
    . 
    . duplicates drop id mdate, force 
    
    Duplicates in terms of id mdate
    
    (0 observations are duplicates)
    
    . 
    . bysort id (mdate) : gen toexpand = cond(_n == 1, 1, mdate[2] - mdate[1])  
    
    . 
    . expand toexpand 
    (20 observations created)
    
    . 
    . bysort id (mdate) : replace mdate = mdate[1] + _n - 1 
    (20 real changes made)
    
    . 
    . list, sepby(id)
    
         +---------------------------------+
         | id   event     mdate   toexpand |
         |---------------------------------|
      1. |  1       1    2023m1          1 |
      2. |  1       2    2023m2          6 |
      3. |  1       2    2023m3          6 |
      4. |  1       2    2023m4          6 |
      5. |  1       2    2023m5          6 |
      6. |  1       2    2023m6          6 |
      7. |  1       2    2023m7          6 |
         |---------------------------------|
      8. |  2       1    2023m4          1 |
      9. |  2       2    2023m5          2 |
     10. |  2       2    2023m6          2 |
         |---------------------------------|
     11. |  3       1    2023m2          1 |
     12. |  3       2    2023m3         10 |
     13. |  3       2    2023m4         10 |
     14. |  3       2    2023m5         10 |
     15. |  3       2    2023m6         10 |
     16. |  3       2    2023m7         10 |
     17. |  3       2    2023m8         10 |
     18. |  3       2    2023m9         10 |
     19. |  3       2   2023m10         10 |
     20. |  3       2   2023m11         10 |
     21. |  3       2   2023m12         10 |
         |---------------------------------|
     22. |  4       1    2023m3          1 |
     23. |  4       2    2023m4          6 |
     24. |  4       2    2023m5          6 |
     25. |  4       2    2023m6          6 |
     26. |  4       2    2023m7          6 |
     27. |  4       2    2023m8          6 |
     28. |  4       2    2023m9          6 |
         +---------------------------------+
    
    . 
    . bysort mdate : gen wanted = _N 
    
    . 
    . list, sepby(mdate)
    
         +------------------------------------------+
         | id   event     mdate   toexpand   wanted |
         |------------------------------------------|
      1. |  1       1    2023m1          1        1 |
         |------------------------------------------|
      2. |  1       2    2023m2          6        2 |
      3. |  3       1    2023m2          1        2 |
         |------------------------------------------|
      4. |  3       2    2023m3         10        3 |
      5. |  1       2    2023m3          6        3 |
      6. |  4       1    2023m3          1        3 |
         |------------------------------------------|
      7. |  4       2    2023m4          6        4 |
      8. |  1       2    2023m4          6        4 |
      9. |  3       2    2023m4         10        4 |
     10. |  2       1    2023m4          1        4 |
         |------------------------------------------|
     11. |  3       2    2023m5         10        4 |
     12. |  4       2    2023m5          6        4 |
     13. |  2       2    2023m5          2        4 |
     14. |  1       2    2023m5          6        4 |
         |------------------------------------------|
     15. |  4       2    2023m6          6        4 |
     16. |  1       2    2023m6          6        4 |
     17. |  3       2    2023m6         10        4 |
     18. |  2       2    2023m6          2        4 |
         |------------------------------------------|
     19. |  1       2    2023m7          6        3 |
     20. |  4       2    2023m7          6        3 |
     21. |  3       2    2023m7         10        3 |
         |------------------------------------------|
     22. |  3       2    2023m8         10        2 |
     23. |  4       2    2023m8          6        2 |
         |------------------------------------------|
     24. |  3       2    2023m9         10        2 |
     25. |  4       2    2023m9          6        2 |
         |------------------------------------------|
     26. |  3       2   2023m10         10        1 |
         |------------------------------------------|
     27. |  3       2   2023m11         10        1 |
         |------------------------------------------|
     28. |  3       2   2023m12         10        1 |
         +------------------------------------------+

    Comment


    • #3
      Thank you very much Nick Cox !!!

      Comment

      Working...
      X