That approach would not work because once you keep only the observations with form1, you will not find any with form3, as they have already been dropped. Your error was that the pattern is missing the right single quote when you refer to j:
If you are going to prune files upfront, you might as well break up the filename into the parts you want from the start. That way you can make sure that every filename you want to process matches your expectations. You could do something like:
Note that inlist() accepts at most 10 arguments when they are strings, which leaves room for 9 match strings after the variable being tested. If you have more, you can make a separate dataset that holds the list and use merge to reduce the observations to those that match it.
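A minimal sketch of the merge approach, assuming the file list with its form variable is already in memory. The form names in the list are made up for illustration, and keepforms is just a temporary file name chosen here:

```stata
* build a small dataset holding the forms to keep (hypothetical values)
preserve
clear
input str20 form
"form1"
"form3"
"form7"
end
tempfile keepforms
save "`keepforms'"
restore

* keep only the observations whose form appears in the list
merge m:1 form using "`keepforms'", keep(match) nogenerate
```

Because the list lives in a dataset, it can hold as many entries as needed.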
Code:
keep if strmatch(filename, "*`j'*.txt")
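To see the quoting at work without dropping anything, here is a small sketch that assumes the loop runs over the values form1 and form3 (the loop itself is my assumption, it is not shown in this post). It uses count instead of keep, so the form3 files are still in memory when the second pass runs:

```stata
* a local macro is referenced with a left quote (`) and a right quote (')
foreach j in form1 form3 {
    count if strmatch(filename, "*`j'*.txt")
    display "files matching `j': " r(N)
}
```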
Code:
clear all
filelist, dir("text_files")

* reduce to files with a ".txt" file extension
keep if strmatch(filename, "*.txt")

* split the file name into parts
gen s = subinstr(filename,".txt", "", 1)
split s, parse("_")
rename (`r(varlist)') (id date form)
assert !mi(id, date, form)

* reduce to form1 and form3
keep if inlist(form, "form1", "form3")
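If you want to check what the split step produces before running it on the real directory, you can try it on a couple of made-up filenames. The names below are hypothetical and only assume the id_date_form.txt layout used above:

```stata
clear
input str40 filename
"101_20200115_form1.txt"
"102_20200220_form3.txt"
end

* strip the extension, then split on the underscore
gen s = subinstr(filename, ".txt", "", 1)
split s, parse("_")
rename (`r(varlist)') (id date form)

list filename id date form
```

A filename with fewer than three parts would leave one of the new variables empty, which is what the assert in the code above is there to catch.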
Here's an expanded version of the program that handles the extra part variables:
Code:
* code to import one text file
program import_txt
  // move values of interest from variables to locals
  local dsource = dirname
  local fsource = filename
  local id1 = id
  local date1 = date
  local form1 = form

  import delimited using `"`dsource'/`fsource'"', clear stringcols(_all) varnames(nonames)

  // get the desired info
  keep if strpos(v1,"name:")
  gen name = subinstr(v1,"name:","",1)

  // copy over the file's information
  gen sourcefile = `"`fsource'"'
  gen sourcedir = `"`dsource'"'
  gen id = "`id1'"
  gen date = "`date1'"
  gen form = "`form1'"
end

runby import_txt, by(dirname filename) verbose