  • New on SSC: listsome - a program to list a small/random sample of observations

    Thanks to Kit Baum, a new program called listsome is now available on SSC.

    As its name suggests, listsome is used to list values of variables for some observations. The observations are listed in the order they appear in the data. Users can specify the maximum number of observations to list (default is 20) and can request a random sample of observations. The random sampling is very fast as the draws are done without changing or sorting the data in memory. Stata 9.2 or higher is needed to run listsome.

    To install, type in Stata's command window:
    Code:
    ssc install listsome
    Once installed, type
    Code:
    help listsome
    to get more information.

    listsome is particularly useful to limit the number of observations listed when using the if or in qualifiers. For example,
    Code:
    sysuse nlsw88.dta, clear
    listsome industry-hours if grade > 12
    The first observations that meet the condition may not be representative of the whole so a random sample may be more informative:
    Code:
    listsome industry-hours if grade > 12, random
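    For comparison, here is a rough sketch of how the non-random listing above might be approximated with the built-in list command alone. The variable name nmatch is made up for illustration, and unlike listsome this approach temporarily adds a variable to the data in memory:
    Code:
    sysuse nlsw88.dta, clear
    * running count of observations that meet the condition (nmatch is a hypothetical helper)
    gen long nmatch = sum(grade > 12)
    * list only the first 20 observations that meet the condition
    list industry-hours if grade > 12 & nmatch <= 20
    drop nmatch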
    When performing data cleaning tasks, listsome can be used to increase transparency by including in the log file a sample of observations that shows the effect of a change on a variable. For example:
    Code:
    * separate occupations (note that this could be better done using split)
    decode occupation, gen(stemp)
    
    gen occup1 = regexr(stemp,"/.+","")
    listsome occup* if occup1 != stemp, random max(50)
    
    gen occup2 = regexr(stemp,".+/","") if occup1 != stemp
    listsome occup* if occup1 != stemp, random max(50)
    All valid options for the built-in list command can be used with listsome.
    Code:
    listsome occup* if occup1 != stemp, random max(50) noobs clean
    If the random option is used, make sure to set the seed at the top of your do-file to make the listings reproducible:
    Code:
    set seed 12345
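    Putting the pieces together, a minimal do-file sketch (reusing the dataset and seed value from the examples above) could look like this:
    Code:
    * set the seed once, before any random listing
    set seed 12345
    sysuse nlsw88.dta, clear
    listsome industry-hours if grade > 12, random
    Rerunning the do-file from the top should then produce the same listing each time.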
    Last edited by Robert Picard; 18 Aug 2014, 10:32. Reason: damn autocorrect!