  • #16
    It is complicated. Simply put, dtaverify is academically correct (it works according to the specification) but not practically correct (it does not handle all real-life situations). One version of Stata had a bug: it created malformed dta files, to which it was itself immune. StataCorp later fixed the bug, so newer versions of Stata do not cause this problem. However, if a data file was produced by the original Stata 13.0 (first DVD production, June 2013), it will not be valid according to dtaverify's standard (specification 117).

    I guess Stata developers had a choice, what to consider a valid file:
    A) any file that Stata can open;
    B) any file that corresponds to the official published specification;


    It seems they opted for option B, which in practice means that some files Stata produced earlier are not considered valid by the verifier and may not import correctly into third-party software that is unaware of the bug.

    Anyhow, this problem is curable. There is enough information in the file to recover the corrupt part, and this is what use13_fix() does. Furthermore, Stata itself (all 13.x versions) doesn't seem to be affected by this problem; at least, I haven't seen it happen yet. So unless you want to import your data files into third-party software, there is no reason to worry.

    If there is a file that use13_fix() does not fix, I will need to see the data. Guessing is a lot of work, which I can't afford at the moment.

    Best, Sergiy


    • #17
      Thanks, I will let you know what I can about the value label issues not fixed so far, but you have a very useful tool! Two more asides:
      1. StataCorp told me that having corrupted data loaded could be the cause of crashes. So I am still unsure whether all these errors were just a distraction, on the reasoning that if Stata can -use- a file, nothing can go wrong afterwards. But yes, I need to stop worrying about this.
      2. If I follow what you're saying, it is still the case that Stata ships with an auto.dta that is malformed by the Stata 13.0 bug. This is sad, though maybe only a cosmetic error. (Unless third-party developers use the same files to test their code.)
      Thanks again, Sergiy, this was very useful. This knowledge should spread, as should your tool.


      • #18
        By the way, -dtaverify- was the first command I saw that had an aside to programmers like this in its help file:

        Aside for programmers
        The source code for dtaverify may be of interest to Stata programmers for two reasons:
        1. It provides a useful secondary description of the file formats.
        2. It provides an example of how code can be written in Mata to read complicated binary formats.
        dtaverify, a command stored in dtaverify.ado, is merely a switcher that jumps to other, standard-specific routines. It is not interesting, but the standard-specific routines are interesting. We recommend you see viewsource dtaverify_117.ado.
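
To give a flavour of what reading this binary format involves, here is a minimal sketch in Python (not Mata, and not taken from the dtaverify source). It parses only the header of a format-117 file, assuming the published layout: literal tags with no whitespace between them, a 2-byte variable count, a 4-byte observation count, and length-prefixed label and timestamp strings. The function name `parse_header_117` is hypothetical, and the sketch does not look at value labels, so it would not detect the Stata 13.0 bug discussed in this thread.

```python
import struct


def parse_header_117(data: bytes) -> dict:
    """Parse the <header> element at the start of a format-117 .dta image."""
    pos = 0

    def expect(tag: bytes) -> None:
        # The format uses literal marker tags; anything else is malformed.
        nonlocal pos
        if data[pos:pos + len(tag)] != tag:
            raise ValueError(f"expected {tag!r} at byte {pos}")
        pos += len(tag)

    def take(n: int) -> bytes:
        # Consume exactly n bytes or fail if the input is too short.
        nonlocal pos
        chunk = data[pos:pos + n]
        if len(chunk) != n:
            raise ValueError(f"input truncated at byte {pos}")
        pos += n
        return chunk

    expect(b"<stata_dta><header><release>")
    release = take(3).decode("ascii", errors="replace")
    if release != "117":
        raise ValueError(f"not a format-117 file (release {release!r})")

    expect(b"</release><byteorder>")
    order = take(3)
    if order not in (b"LSF", b"MSF"):
        raise ValueError(f"unknown byte order {order!r}")
    endian = "<" if order == b"LSF" else ">"  # LSF = least significant first

    expect(b"</byteorder><K>")
    (nvar,) = struct.unpack(endian + "H", take(2))  # number of variables
    expect(b"</K><N>")
    (nobs,) = struct.unpack(endian + "I", take(4))  # number of observations

    expect(b"</N><label>")
    label = take(take(1)[0]).decode("latin-1")  # 1-byte length, then text
    expect(b"</label><timestamp>")
    stamp = take(take(1)[0]).decode("latin-1")  # 1-byte length, then text
    expect(b"</timestamp></header>")

    return {"release": release, "byteorder": order.decode("ascii"),
            "nvar": nvar, "nobs": nobs, "label": label, "timestamp": stamp}
```

To try it on a real file, one could pass the first few hundred bytes, for example `parse_header_117(open("mydata.dta", "rb").read(512))`, where `mydata.dta` is a placeholder path. A strict reader like this one corresponds to option B above: it rejects anything that deviates from the specification, whether or not Stata itself could open the file.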
