Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Adding to a monthly panel data information on last & next dates from another panel

    I have information of ids from two sources: (1) a monthly purchase panel, (2) an yearly survey (with info on the exact year/month of interview). I would like to add to the monthly purchase data information on the last & next survey interview date (year/month).

    Consider the toy datasets below. The first one contains info on every month individual "A" has purchased. The second one informs at which year/month individual "A" has been interviewed over the years. Note that we may have purchase data of individuals before they were first interviewed (the first five rows in the purchase data) or after they are last interviewed. We also may have cases where the earliest survey interview occurred before the first purchase or cases where the lastest interview occurred after the last purchase (the last row in the survey data).
    Code:
    clear
    input id year month
    "A" 2008 1
    "A" 2008 2
    "A" 2008 3
    "A" 2008 4
    "A" 2008 5
    "A" 2008 6
    "A" 2008 7
    "A" 2008 8
    "A" 2008 9
    "A" 2008 10
    "A" 2008 11
    "A" 2008 12
    "A" 2009 1
    "A" 2009 2
    "A" 2009 3
    "A" 2009 4
    "A" 2009 5
    "A" 2009 6
    "A" 2009 7
    "A" 2009 8
    "A" 2009 9
    "A" 2009 10
    "A" 2009 11
    "A" 2009 12
    end
    save purchase_panel, replace
    
    clear 
    input str1 id survey_yr survey_mo
    "A" 2008 6
    "A" 2009 6
    "A" 2010 6
    end
    save survey_ym, replace
    The desired output is as below. Take the first observation in the purchase data. Because there is no previous interview date (in the survey data) relative to the purchase year/month of Jan/2008 we have missing values for the new variables `last_survey_yr` & `last_survey_mo`, whereas the variables `next_survey_yr` & `next_survey_mo` variables are filled with the next interview date of Jun/2006 (first row in the survey data).
    Code:
    clear
    input str1 id year month last_survey_yr last_survey_mo next_survey_yr next_survey_mo
    "A" 2008 1 . . 2008 6
    "A" 2008 2 . . 2008 6
    "A" 2008 3 . . 2008 6
    "A" 2008 4 . . 2008 6
    "A" 2008 5 . . 2008 6
    "A" 2008 6 2008 6 2008 6
    "A" 2008 7 2008 6 2009 6
    "A" 2008 8 2008 6 2009 6
    "A" 2008 9 2008 6 2009 6
    "A" 2008 10 2008 6 2009 6
    "A" 2008 11 2008 6 2009 6
    "A" 2008 12 2008 6 2009 6
    "A" 2009 1 2008 6 2009 6
    "A" 2009 2 2008 6 2009 6
    "A" 2009 3 2008 6 2009 6
    "A" 2009 4 2008 6 2009 6
    "A" 2009 5 2008 6 2009 6
    "A" 2009 6 2009 6 2009 6
    "A" 2009 7 2009 6 2010 6
    "A" 2009 8 2009 6 2010 6
    "A" 2009 9 2009 6 2010 6
    "A" 2009 10 2009 6 2010 6
    "A" 2009 11 2009 6 2010 6
    "A" 2009 12 2009 6 2010 6
    end




  • #2
    I think I managed to write a code that solves the problem (probably not the most efficient though).

    PS: in the first dataset I forgot to specify str1 when inputting id - I corrected it below.

    The data
    Code:
    clear
    input str1 id year month
    "A" 2008 1
    "A" 2008 2
    "A" 2008 3
    "A" 2008 4
    "A" 2008 5
    "A" 2008 6
    "A" 2008 7
    "A" 2008 8
    "A" 2008 9
    "A" 2008 10
    "A" 2008 11
    "A" 2008 12
    "A" 2009 1
    "A" 2009 2
    "A" 2009 3
    "A" 2009 4
    "A" 2009 5
    "A" 2009 6
    "A" 2009 7
    "A" 2009 8
    "A" 2009 9
    "A" 2009 10
    "A" 2009 11
    "A" 2009 12
    end
    save purchase_panel, replace
    
    clear 
    input str1 id survey_yr survey_mo
    "A" 2008 6
    "A" 2009 6
    "A" 2010 6
    "A" 2011 6
    end
    save survey_ym, replace
    The code
    Code:
    use purchase_panel, clear
    gen ym = ym(year, month)
    
    gen survey_yr = year
    merge m:1 id survey_yr using survey_ym, keep(master match) 
    gen exact = ym(survey_yr, survey_mo) if _merge==3
    drop survey_yr survey_mo _merge
    
    gen survey_yr = year - 1
    merge m:1 id survey_yr using survey_ym, keep(master match)
    gen last = ym(survey_yr, survey_mo) if _merge==3
    drop survey_yr survey_mo _merge
    
    gen survey_yr = year + 1
    merge m:1 id survey_yr using survey_ym, keep(master match)
    gen next = ym(survey_yr, survey_mo) if _merge==3
    drop survey_yr survey_mo _merge
    
    gen last_survey_ym = last if exact >= ym | exact < ym & abs(ym - last) < abs(ym - exact)
    replace last_survey_ym = exact if exact <= ym & abs(ym - last) > abs(ym - exact)
    
    gen next_survey_ym = next if exact <= ym | exact > ym & abs(ym - next) < abs(ym - exact)
    replace next_survey_ym = exact if exact >= ym & abs(ym - next) > abs(ym - exact)
    
    format ym last_survey_ym next_survey_ym %tm
    drop exact last next
    The output
    Code:
    . list id ym last_survey_ym next_survey_ym, sep(0)
    
         +------------------------------------+
         | id        ym   last_s~m   next_s~m |
         |------------------------------------|
      1. |  A    2008m1          .     2008m6 |
      2. |  A    2008m2          .     2008m6 |
      3. |  A    2008m3          .     2008m6 |
      4. |  A    2008m4          .     2008m6 |
      5. |  A    2008m5          .     2008m6 |
      6. |  A    2008m6     2008m6     2008m6 |
      7. |  A    2008m7     2008m6     2009m6 |
      8. |  A    2008m8     2008m6     2009m6 |
      9. |  A    2008m9     2008m6     2009m6 |
     10. |  A   2008m10     2008m6     2009m6 |
     11. |  A   2008m11     2008m6     2009m6 |
     12. |  A   2008m12     2008m6     2009m6 |
     13. |  A    2009m1     2008m6     2009m6 |
     14. |  A    2009m2     2008m6     2009m6 |
     15. |  A    2009m3     2008m6     2009m6 |
     16. |  A    2009m4     2008m6     2009m6 |
     17. |  A    2009m5     2008m6     2009m6 |
     18. |  A    2009m6     2009m6     2009m6 |
     19. |  A    2009m7     2009m6     2010m6 |
     20. |  A    2009m8     2009m6     2010m6 |
     21. |  A    2009m9     2009m6     2010m6 |
     22. |  A   2009m10     2009m6     2010m6 |
     23. |  A   2009m11     2009m6     2010m6 |
     24. |  A   2009m12     2009m6     2010m6 |
         +------------------------------------+

    Comment

    Working...
    X