I'm just writing to note that -fileread()- and -filewrite()- can function nicely as part of a *faster* and more sophisticated version of -filefilter-.
Illustration: While recently working with a 325M text file, I wanted to filter it in various ways, including the simple task of changing all instances of multiple blanks to one. I started off by using multiple calls to -filefilter- (change 10 blanks to 1, 9 blanks to 1, ..., 2 blanks to 1). I then thought to use -fileread()- and -filewrite()- with ordinary string functions:
Code:
gen strL s = fileread("infile")
replace s = itrim(s)
gen b = filewrite("outfile", s)
This approach was not only convenient but also much faster, at least on a Windows machine. For example, on my 325M file, just the partial task of substituting one blank for two blanks, using -filefilter-, took something like 30 sec. Doing the whole task, using fileread/filewrite and itrim() as above, took less than 4 sec. In principle, one could use this to create an enhanced version of -filefilter- that accepted regular expressions as needed, worked with binary as well as text files, etc. -fileread()- is limited by the 2 GB maximum length of a strL, but that's not typically an issue.
This is one of several instances in which -fileread()- and -filewrite()- have impressed me with their speed and convenience.
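For readers outside Stata, here is a minimal sketch of the same read-whole-file / collapse-blanks / write-back pattern in Python. The function name and file names are illustrative, not from the original post; the regex substitution plays the role of itrim() (collapsing runs of spaces to a single space).

```python
import re

def collapse_blanks(infile, outfile):
    """Read an entire text file, collapse runs of blanks, write it back.

    Rough analogue of the Stata snippet:
      fileread()  -> f.read()
      itrim()     -> re.sub(r" {2,}", " ", s)
      filewrite() -> f.write(s)
    """
    with open(infile, "r") as f:
        s = f.read()                 # whole file in memory, like fileread()
    s = re.sub(r" {2,}", " ", s)     # two-or-more blanks become one, in a single pass
    with open(outfile, "w") as f:
        f.write(s)                   # write the filtered text, like filewrite()
```

A single regex pass is what replaces the cascade of -filefilter- calls (10 blanks to 1, 9 to 1, ...): the quantifier `{2,}` handles any run length at once.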
Regards, Mike