Hello. I have over 10,000 zipped txt files that I would like to unzip, convert to Stata format, and append into a single dta dataset. The problem is that the zip files' names are 11-digit numbers with large gaps between them, so looping with forvalues combined with capture confirm file takes forever. Stata has already been running for 20 hours and hasn't unzipped even half of the files. I hope there is a more efficient way of doing this.
Also, is it possible to work with zipped files in Stata without having to unzip them?
Below is the code which is taking forever:
Code:
forvalues i=11000000000/54000000000 {
    capture confirm file "`i'.zip"
    if _rc==0 {
        unzipfile `i'.zip, replace
    }
}

clear
tempfile temp
save `temp', emptyok

forvalues i=11000000000/54000000000 {
    capture confirm file "`i'.dta"
    if _rc==0 {
        use `i'.dta, clear
        infix uf 1-2 mun 3-7 dist 8-9 subdist 10-11 set 12-15 st_set 16-16 str lati 322-336 str longe 337-351 ///
            tp 472-473 str subtp 474-513 quadra 545-547 face 548-550 cep 551-558 using 11000150500.txt
        append using `temp'
        display "`i'"
        save `"`temp'"', replace
    }
}

save all.dta, replace
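For comparison, the loop above tests roughly 43 billion candidate names to find about 10,000 files. One possible alternative, offered only as a sketch, is to ask Stata for the list of zip files that actually exist and loop over that list instead. The version below rests on assumptions not confirmed in this post: that all archives sit in the current working directory, and that each `<number>.zip` contains a single text file named `<number>.txt` with the same fixed-width layout. The local names `zips`, `z`, `stem` and the tempfile `all` are made up for illustration.

```stata
* Sketch only: iterate over the zip files that exist, not over every possible number.
* Assumption: the archives are in the current working directory.
local zips : dir "." files "*.zip"

clear
tempfile all
save `all', emptyok

foreach z of local zips {
    * Drop the ".zip" extension to recover the 11-digit stem
    local stem = subinstr("`z'", ".zip", "", .)

    unzipfile "`z'", replace

    * Assumption: the archive held one file called <stem>.txt
    infix uf 1-2 mun 3-7 dist 8-9 subdist 10-11 set 12-15 st_set 16-16 ///
        str lati 322-336 str longe 337-351 tp 472-473 str subtp 474-513 ///
        quadra 545-547 face 548-550 cep 551-558 using "`stem'.txt", clear

    * Add everything read so far and store the running total
    append using `all'
    save `all', replace
    display "`stem'"
}

use `all', clear
save all.dta, replace
```

Two caveats. With more than 10,000 file names, the list held in `zips` may exceed the macro length limit in smaller Stata editions, in which case the files would need to be processed in batches (for example by a narrower pattern than `*.zip`). Also, re-saving the growing dataset on every pass gets slower as it grows, so appending in batches may be worth trying if this is still too slow.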