Dear Statalisters,
I am using Stata/MP2 14 for Windows. I have a folder with more than 40,000 text files.
Since those files come in different encodings, I want to use Stata's new -unicode translate- feature to convert them to UTF-8.
In doing so, I ran into a problem.
Here is an example:
Code:
    local initial_dir "`c(pwd)'"

    **** create a folder
    capture mkdir folder
    cd folder

    **** create and store 22,222 datasets
    qui {
        forvalues f=1/22222 {
            noisily di "`f' of 22222"
            clear
            set obs 1
            gen str string=`"test string"'
            save file`f', replace
        }
    }

    **** unicode analysis
    clear
    set more on
    unicode analyze *
    clear all
    cd "`initial_dir'"
The code creates a folder with 22,222 files and then tells Stata to -unicode analyze- each file.
However, it seems that Stata analyzes only 10,000 of those files. Here is the result:
Code:
    File summary (before starting):
        10000 file(s) specified
        10000 file(s) to be examined ...
So 12,222 files are not analyzed. The same happens with -unicode translate-; 10,000 seems to be a general limit.
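If the 10,000 cap applies only to the number of files a single wildcard call picks up, one possible workaround would be to call -unicode analyze- once per file instead of passing `*`. The following is a minimal, untested sketch. It assumes the working directory is the example folder containing file1.dta through file22222.dta, and that the per-call limit really is the cause, which is not confirmed:

```stata
* Untested sketch: analyze one file per call instead of using a wildcard.
* Assumes the current directory holds file1.dta ... file22222.dta
* as created by the example above.
clear
set more off
forvalues f = 1/22222 {
    unicode analyze file`f'.dta
}
```

If all 22,222 files are reported this way, that would suggest the limit is tied to the wildcard file specification rather than to the command as such.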
Can anybody reproduce this result, or does anyone have a solution? Please let me know, I would be very grateful.
Many thanks
Ali