
  • End-of-file Problems

    Hey everyone,
    I'm having an interesting issue insheeting a very large .csv dataset. One of the fields consists of a series of random characters that sporadically contains DOS end-of-file (EOF) markers. These EOF markers inevitably stop the import, leaving me with truncated data. If I could open the .csv file, I could replace the EOF characters with something else, but it's too big to open in any program that I've tried. I know that SAS has an option to ignore the EOF markers (IGNOREDOSEOF). Is there anything like this in Stata? If I had SAS, I could import the file and then just export a .dta, but I don't. Any ideas?

    Any help is appreciated. Thanks!

  • #2
    Use filefilter to replace those characters with something else, import the file, and then replace them back. For example, if the # sign is never used in the file, you can temporarily use it as a placeholder for the EOFs.
    Sergiy
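
A minimal sketch of the workflow described above, assuming the DOS EOF marker is the usual Ctrl-Z byte (decimal 26, hex 1A) and that # never appears in the raw file. The file names bigfile.csv and clean.csv and the variable name field are hypothetical placeholders, not names from the original post.

```stata
* Sketch only: bigfile.csv, clean.csv, and the variable "field" are hypothetical.
* Assumes "#" never occurs anywhere in the raw file.

* 1. Write a copy of the file with every EOF byte (decimal 26) replaced by "#"
filefilter bigfile.csv clean.csv, from(\026d) to(#) replace

* 2. Import the cleaned copy
insheet using clean.csv, comma clear

* 3. Restore the original character in the affected string variable
replace field = subinstr(field, "#", char(26), .)
```

filefilter processes the file as a stream, so the file does not need to fit in an editor. Step 3 is only needed if the original bytes matter for later work.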

    • #3
      To expand slightly on Sergiy's response: I would start with hexdump (using either the analyze option, which is my preference, or the tabulate option) to see which characters are never used. If the unused character is something "odd", you can download the asciiplot command (find it using search) to look up the right code.
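
As a rough illustration of that check, the commands below inspect the raw file before choosing a placeholder. The file name bigfile.csv is a hypothetical placeholder.

```stata
* Sketch only: bigfile.csv is a hypothetical file name.

* Summarize the file's contents (line-end style, control characters, and so on)
hexdump bigfile.csv, analyze

* Or tabulate how often each byte value occurs;
* a value that never occurs is a safe placeholder
hexdump bigfile.csv, tabulate

* Locate the user-written asciiplot command mentioned above
search asciiplot
```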

      • #4
        I appreciate the input, guys! Yes, I found a solution following your suggestions. Saved me!
