It is complicated. Simply put, dtaverify is academically correct (it works according to the specification) but not practically correct (it does not handle all real-life situations). There was a version of Stata with a bug in it: it created malformed dta files, to which it was itself immune. StataCorp later fixed the bug, so newer versions of Stata do not cause this problem. However, if the data file was produced by the original Stata 13.0 (first DVD production, June 2013), it will not be valid according to dtaverify's standard (specification 117).
I guess the Stata developers had a choice of what to consider a valid file:
A) any file that Stata can open;
B) any file that corresponds to the official published specification.
It seems they opted for option B, which in practice means that some of the files that Stata produced earlier are not considered valid by the verifier and may not be imported correctly into third-party software that is not aware of the bug.
Anyhow, this problem is curable. There is enough information in the file to recover the corrupt part, and this is what use13_fix() does. Furthermore, Stata itself (all 13.x versions) doesn't seem to be affected by this problem; at least I haven't seen it happen yet. So unless you want to import your data files into third-party software, there is no reason to worry.
If there is a file that use13_fix() does not fix, I will need to see the data. Guessing is a lot of work, which I can't afford at the moment.
Best, Sergiy