Hi,
I'm trying to split a master dataset into its constituent country parts. I'm using large datasets (300 GB) that currently take more than 48 hours to run, so any help speeding up the process would be much appreciated.
Using this data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str20 bvd_id_number strL main_activity str2 countrycode
"CN9360430024" "Manufacturing" "CN"
"AU072891993" "Services" "AU"
"US149668182L" "Manufacturing" "US"
"US133096011L" "Services" "US"
"CA32531NC" "Services" "CA"
end
and this Stata code:
Code:
//create country list
glevelsof countrycode, local(countries)
//timer on
timer on 1
parallel: foreach c of local countries {
    use overviews.dta, clear
    keep if countrycode == "`c'"
    save `c', replace
}
timer off 1
timer list 1
Running this stops with the following error:
Code:
//timer on
.
. timer on 1
.
. parallel: foreach c of local countries {
--------------------------------------------------------------------------------
Parallel Computing with Stata (by GVY)
Clusters : 4
pll_id : rp2wznupm1
Running at : D:\Firmographics\overviews\parallell_test
Randtype : datetime
Waiting for the clusters to finish...
-3621 cluster 0004 has exited without error...
-3621 cluster 0001 has exited without error...
-3621 cluster 0002 has exited without error...
-3621 cluster 0003 has exited without error...
--------------------------------------------------------------------------------
Enter -parallel printlog #- to checkout logfiles.
--------------------------------------------------------------------------------
unlink(): 3621 attempt to write read-only file
parallel_recursively_rm(): - function returned error
parallel_clean(): - function returned error
<istmt>: - function returned error
r(3621);
end of do-file
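For comparison, a serial variant without the -parallel- prefix could look like the sketch below. It swaps in the built-in levelsof for glevelsof and uses the "use if ... using" form so that only one country's rows are held in memory per pass. It assumes overviews.dta sits in the working directory. It is only a sketch: each pass still reads through the master file, and whether it is any faster on a file this size is untested.

```stata
* Sketch only: serial split that loads one country's rows per pass.
* Assumption: overviews.dta is in the current working directory.

* Load just countrycode to build the list of countries
use countrycode using overviews.dta, clear
levelsof countrycode, local(countries)

timer on 1
foreach c of local countries {
    * Read only the rows for this country from the master file
    use if countrycode == "`c'" using overviews.dta, clear
    save "`c'", replace
}
timer off 1
timer list 1
```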
Ciaran