  • Tabulating missing data for panel data

    Hello,

    I am trying to grasp what data is missing in a dataset of (longitudinal) panel data. I will be using gllamm on this dataset. Does gllamm drop all data for a subject if it is missing data from any year? If so, I would like my missing data reports to be able to tell me how many observations have missing data for any year in the dataset.

    Also, I'm trying to figure out the best/most appropriate command(s) for tabulating missing data. Right now I am using misschk. Is there something else that I should use instead of or in addition to misschk?

    Many thanks,
    Alyssa

  • #2
    You might want to check the following: 1 and 2.


    • #3
      Alyssa: gllamm and misschk are community-contributed commands, as you are asked to explain (FAQ Advice #12). While you're visiting, please see comments in #16 on how to close threads which apply to some recent threads you started.

      A good way to find out what commands like gllamm do given missing data is to look, after fitting a model, at which observations were excluded and which were included. See e.g. https://www.stata-journal.com/sjpdf....iclenum=dm0030 I'd be surprised if gllamm insists on complete panels and ignores incomplete ones, but I've never used it and what it does beats my surprise, always.
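A minimal sketch of that check, under stated assumptions: the toy data and the variable names (id, Year, y, x) are invented for illustration, and official regress stands in for gllamm. Official estimation commands leave behind e(sample), which marks the observations actually used; whether gllamm does the same is an assumption to verify before relying on it.

```stata
* Toy long-form panel; all names and values are hypothetical
clear
input id Year y x
1 2012 1.0 2
1 2013 2.0 .
2 2012 1.5 3
2 2013 2.5 4
3 2012 .   1
3 2013 3.0 5
end

* Stand-in model: official regress, not gllamm
regress y x

* e(sample) is 1 for observations used in the fit, 0 otherwise
generate byte used = e(sample)

* Which years lost observations, and which rows were excluded
tab Year used
list id Year if !used
```

If the excluded rows are isolated observations rather than whole subjects, the command is dropping incomplete observations, not incomplete panels.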

      It's hard to say what's best, particularly for you, for showing, or more generally reporting on, missing data. misstable (official Stata) and missings (Stata Journal) are other offerings.

      SJ-17-3 dm0085_1 . . . . . . . . . . . . . . . . Software update for missings
      (help missings if installed) . . . . . . . . . . . . . . . N. J. Cox
      Q3/17 SJ 17(3):779
      identify() and sort options have been added

      SJ-15-4 dm0085 Speaking Stata: A set of utilities for managing missing values
      (help missings if installed) . . . . . . . . . . . . . . . N. J. Cox
      Q4/15 SJ 15(4):1174--1185
      provides command, missings, as a replacement for, and extension
      of, previous commands nmissing and dropmiss
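As a rough illustration of the official option mentioned above, here is a sketch using misstable on invented data (id, Year, var1 and var2 are hypothetical names). The generate() option of misstable summarize creates one indicator per variable with missing values, and that indicator can then be tabulated against any grouping variable. The community-contributed missings command is not shown because it has to be installed first.

```stata
* Toy long-form panel; all names and values are hypothetical
clear
input id Year var1 var2
1 2012 5 .
1 2013 . 2
2 2012 3 1
2 2013 . .
3 2013 4 6
end

* Counts of missing values per variable; generate() adds
* indicators miss_var1 and miss_var2 (1 = missing, 0 = observed)
misstable summarize var1 var2, generate(miss_)

* Which combinations of variables are missing together
misstable patterns var1 var2

* Cross-tabulate the indicator against year
tab Year miss_var1
```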


      Last edited by Nick Cox; 15 Aug 2018, 11:12.


      • #4
        Thank you, Amin and Nick. Nick Cox, I am a bit confused: I don't understand how to use the article you linked to in order to figure out which observations were excluded after I run gllamm. Could you please explain a bit more simply?


        • #5
          No; sorry, but I don't think I can. I was answering which are "the best/most appropriate command(s) for tabulating missing data" and my answer is no more than that missings may help. Your question isn't precise and I can't match it with more precision. If you sketch what kind of output you want, there might be a better reply.


          • #6
            My apologies for my lack of clarity, Nick Cox. I am trying to get a feel for what the various missing data tabulations in Stata are, and specifically which would be helpful for clustered and longitudinal data. For example, one thing that would be interesting to find out is whether certain variable(s) are more often missing in certain years. I have found this to be the case by typing:

            Code:
            count if missing(var1) & Year==2012
            count if missing(var1) & Year==2013
            etc. But it would be great to have a less tedious way to discover these missing patterns.

            It would also be good to know if more data is missing for certain unique subject IDs. Since my data is in the long form, things like misschk and mvpatterns (community-contributed commands to see patterns in missing data) won't show the patterns of missing data over the years for a single subject ID, since each year is a different row of data. Also, due to my data structure, I really can't do missing pattern analyses on data in the wide form. This data is on participation in a program, and participants can have started and stopped the program in any year in our dataset. So, if participants were not involved in the program in a given year, then there is no data for that year and everything looks like it's missing. However, it's not really "missing" because they weren't involved in the program. So, once I reshaped the data to the long form, I dropped all observations where the participant wasn't involved in the program for a given year.

            I hope this is a more adequate explanation, and please let me know if you need more clarification.

            Many thanks,
            alyssa


            • #7
              Code:
              tab Year if missing(var1) 
              
              search xtpatternvar
              are some possibilities.
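To make the first suggestion concrete for long-form data, here is a small sketch on invented data (id, Year, var1 and var2 are hypothetical names, not the poster's variables). It tabulates missingness of one variable by year, then counts missing values per observation with egen's rowmiss() function and totals them by year and by subject.

```stata
* Toy long-form panel; all names and values are hypothetical
clear
input id Year var1 var2
1 2012 5 .
1 2013 . 2
2 2012 3 1
2 2013 . .
3 2013 4 6
end

* Years in which var1 is missing, in one table
tab Year if missing(var1)

* Number of missing values per observation across several variables
egen nmiss = rowmiss(var1 var2)

* Total missing values by year
tabstat nmiss, by(Year) statistics(sum)

* Total missing values by subject, repeated on each of the subject's rows
bysort id: egen nmiss_id = total(nmiss)
```

Because the totals are computed within id on the long-form rows, years in which a subject was not in the program (and so has no row) do not count as missing.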


              • #8
                Hi Nick,

                Thanks, I have decided to go for the "tab Year if missing(var1)" option. This works nicely since I don't have a lot of variables with missing data.

                -alyssa
