I have a very large dataset (~200 million observations) that I am opening inside a loop. I'm wondering whether it would be faster to open it once and use preserve/restore inside the loop instead. Sample code is below; any help would be greatly appreciated.
* Option 1: reload the full dataset on every iteration
forvalues i = 1/$max {
    use "XXX\large_data.dta", clear
    keep if geocode == "`i'"
    save "XXX\large_data_`i'.dta", replace
}
or
* Option 2: load once, then preserve/restore around each subset
use "XXX\large_data.dta", clear
forvalues i = 1/$max {
    preserve
    keep if geocode == "`i'"
    save "XXX\large_data_`i'.dta", replace
    restore
}