  • Drop Records Missing Majority of Observations of Variables

    I have a data set in which several records are missing observations for the majority of the variables (in Excel, several rows would be blank for the majority of the columns). Is there a command to remove these particular records? I don't want to haphazardly drop records that might be missing the occasional observation, just those that are mostly missing (195 missing out of 210 variables).

  • #2
    See -dropmiss- from SSC (ssc install dropmiss) to remove completely missing rows/obs (or alternatively variables/cols). Alternatively, you can count the number of non-missing variables/cols for each row and then drop according to a rule:

    Code:
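    sysuse auto, clear
    * count the non-missing values in each observation across the listed variables
    egen x = rownonmiss(mpg price rep78 foreign)
    tabulate x
    * drop observations with fewer than 4 non-missing values among those variables
    drop if x < 4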
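
The same idea can be turned around to count missing values and apply a "mostly missing" cutoff, which is closer to the original question. The following is only a minimal sketch on toy data: the variable names v1-v4, the new variable nmiss, and the 50% cutoff are assumptions made for illustration. With 210 variables, a rule such as nmiss >= 195 would play the same role. All variables here are numeric.

```stata
* Toy data: 5 observations on 4 numeric variables (hypothetical names)
clear
input v1 v2 v3 v4
1 2 3 4
. . . 4
. . . .
1 . 3 4
. . 3 .
end

* Number of missing values in each observation across v1-v4
egen nmiss = rowmiss(v1-v4)

* Number of variables being checked, so the rule can be stated as a share
unab vars : v1-v4
local k : word count `vars'

* Drop observations in which more than half of the values are missing
drop if nmiss / `k' > 0.5

list
```

Stating the cutoff as a share of the variables checked keeps the rule readable if the variable list changes. Observations with only an occasional missing value are kept.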
    Last edited by eric_a_booth; 28 Feb 2017, 12:43.
    Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX



    • #3
      As a footnote to Eric's post,

      1. dropmiss is not a package on SSC.

      2. A search reveals its history:


      Code:
      . search dropmiss, historical
      
      Search of official help files, FAQs, Examples, SJs, and STBs
      
      SJ-15-4 dm0085  Speaking Stata: A set of utilities for managing missing values
              (help missings if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q4/15   SJ 15(4):1174--1185
              provides command, missings, as a replacement for, and extension
              of, previous commands nmissing and dropmiss
      
      SJ-15-4 dm89_2  . . . . . . . . . . . . . . . . . Software update for dropmiss
              (help dropmiss if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q4/15   SJ 15(4):1186--1187
              dropmiss command has been superseded by a new command, missings,
              which offers various utilities for managing variables that may
              have missing values
      
      SJ-8-4  dm89_1  . . . . Dropping variables or observations with missing values
              (help dropmiss if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q4/08   SJ 8(4):594
              update in style and content; added a new force option
      
      STB-60  dm89  . . . . . Dropping variables or observations with missing values
              (help dropmiss if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              3/01    pp.7--8; STB Reprints Vol 10, pp.44--46
              drops variables or observations with all values (optionally
              any values) missing


      The top line (you heard it here first) is the one to note. dropmiss is considered superseded by missings by their author (c'est moi).

      Dropping observations in which some variables are missing is now more difficult if you use missings, because it is not clear to me that it is generally good practice when multiple imputation is an alternative.
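
For readers who have not used missings, here is a rough sketch of how its subcommands relate to this question. It assumes missings (dm0085 in the search output above) is already installed, and the variable name nmissing is made up for the example. The exact options should be checked against help missings.

```stata
* Assumes the missings command (dm0085) is already installed
sysuse auto, clear

* Report which variables have missing values, and how many
missings report

* Create a variable holding the number of missing values in each observation
* (nmissing is a hypothetical name; all variables are checked by default)
missings tag, generate(nmissing)
tabulate nmissing

* dropobs removes only observations that are missing on ALL of the variables
* named; force is needed here because the data in memory have changed
missings dropobs mpg price rep78 foreign, force
```

In the auto data no observation is missing on all four variables, so dropobs leaves the data unchanged. Dropping on a "mostly missing" rule would still need an explicit drop if on the tag variable.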



      • #4
        Thanks Nick, and sorry for the mix-up. I should have checked the source before posting (I'm pretty sure I've messed this up on Statalist previously)!
        Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX



        • #5
          Not to worry: I often don't remember where my own programs are made public. I had to compile a list (ssc inst njc_stuff)
