Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating observations with missing value by individual ID

    Hi all,

    I have an unbalanced panel with IDs, years and corresponding outcome values for ID-year combinations. I want to make this a balanced panel, by including the same number of years for each individual and for the years that I add, generate values equal to 0 for the outcomes. For example, I want 12 years of observations for each individual, from 2011-2022. So if the first individual in my dataset only has values for 2012 and 2014, I want to create additional years 2011, 2013, and 2015-2022, and then assign 0 values for the outcome for these years. Similarly, for the 2nd individual I want to create 2011, 2014, 2017, 2020, 2022 and assign an outcome of 0 for these years.

    I'm unable to share the actual data but the setup is similar to below:
    id year outcome
    1 2012 0
    1 2014 1
    2 2012 1
    2 2013 1
    2 2015 0
    2 2016 0
    2 2018 0
    2 2019 1
    2 2021 1
    Is there a way to achieve this in Stata?

    Thank you very much.

  • #2
    See https://www.statalist.org/forums/for...ed-to-balanced which asks essentially the same question.

    Comment


    • #3
      Thanks Nick. In this thread, the first step is something like this:

      Code:
      replace year= year[1] if missing(year)
      But in my dataset the missing years are not indicated or appear in the data. So I first need to generate years from 2011-2022 for each id and indicate the missing years for each id. How would I do that? For eg id 1 has only 2012 and 2014. I would I add the missing years?

      Comment


      • #4
        See also

        Code:
        help fillin
        https://www.stata-journal.com/articl...article=dm0011

        So, add 12 pseudo-observations for id -1 (say) and years 2011-2022 and then apply fillin. Then drop the 12 pseudo observations you inserted. .

        Comment


        • #5
          Thanks Nick, I followed this thread and managed to solve the problem.

          https://www.statalist.org/forums/for...the-panel-data

          Comment


          • #6
            In amplification of #4 here is one of several slightly different ways of doing it. Some people would be happy with typing in the extra 12 observations manually, but for full reporting you should prefer concise, repeatable code.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input byte id int year byte outcome
            1 2012 0
            1 2014 1
            2 2012 1
            2 2013 1
            2 2015 0
            2 2016 0
            2 2018 0
            2 2019 1
            2 2021 1
            end
            
            di _N 
            set obs `= _N + 12'
            replace id = -1 in -12/-1 
            bysort id (year) : replace year = 2010 + _n if id == -1 
            fillin id year 
            replace outcome = 0 if outcome == . 
            drop if id == -1 
            
            list, sepby(id)
            
                 +-------------------------------+
                 | id   year   outcome   _fillin |
                 |-------------------------------|
              1. |  1   2011         0         1 |
              2. |  1   2012         0         0 |
              3. |  1   2013         0         1 |
              4. |  1   2014         1         0 |
              5. |  1   2015         0         1 |
              6. |  1   2016         0         1 |
              7. |  1   2017         0         1 |
              8. |  1   2018         0         1 |
              9. |  1   2019         0         1 |
             10. |  1   2020         0         1 |
             11. |  1   2021         0         1 |
             12. |  1   2022         0         1 |
                 |-------------------------------|
             13. |  2   2011         0         1 |
             14. |  2   2012         1         0 |
             15. |  2   2013         1         0 |
             16. |  2   2014         0         1 |
             17. |  2   2015         0         0 |
             18. |  2   2016         0         0 |
             19. |  2   2017         0         1 |
             20. |  2   2018         0         0 |
             21. |  2   2019         1         0 |
             22. |  2   2020         0         1 |
             23. |  2   2021         1         0 |
             24. |  2   2022         0         1 |
                 +-------------------------------+

            Comment

            Working...
            X