  • Could not find a good delimiter for importing txt files

    Dear all,

    I am writing to ask for your help.

    I am trying to import the attached .txt data into Stata. The columns appear to be separated by spaces when the file is opened in a text editor.

    I used the standard approach in Stata: File > Import > Text data (delimited), so that I could preview the data before importing it.

    However, I could not find a delimiter that separates the columns properly; the imported data are a mess whichever delimiter I choose in the options window. I wonder whether I am missing something important.

    Thank you very much in advance and I look forward to hearing from you!

    Best regards,
    Long







  • #2
    Your text does not have delimiters between the input fields. Instead, it is in a fixed-width format with one or more blanks between the fields. Spaces will not work as a delimiter because the second field, the name, itself contains names and initials separated by spaces.

    It is possible to read fixed-width data into Stata using the infile or infix commands. In both cases, the usual first step is to prepare a "dictionary file" telling Stata which columns each field starts and ends in. See help infile2 or help infix for details.
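
    As a rough sketch of what such a dictionary file for infix might look like, assuming a hypothetical layout (the file name layout.dct, the variable names, and the column positions are illustrative and not taken from the attached data; data.TXT stands in for the attached file):

    ```stata
    infix dictionary using data.TXT {
        * hypothetical fields: adjust the names and column ranges to the real file
        str  id    1-6
        str  name  14-45
        byte age   46-47
    }
    ```

    Saving this as layout.dct and then typing infix using layout.dct, clear should read the data. Because each field is located by its column range, the spaces inside the name do not split the field.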

    If this is a one-time task, you might find it easier to launch Excel, choose Import from the File menu, select Fixed Width in the dialog box, and let Excel help you read the file in. Once the data are in Excel and saved as a workbook, Stata's import excel command will read them in quickly and easily.
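
    A minimal sketch of the Stata side of that route, assuming the workbook was saved as data.xlsx (a hypothetical name) without a header row, so that import excel names the variables after the spreadsheet columns (A, B, C, ...); the new variable names are placeholders:

    ```stata
    * read the first worksheet of the (hypothetical) workbook, replacing the data in memory
    import excel using "data.xlsx", clear
    * inspect what arrived before renaming anything
    describe
    * give the first three columns meaningful names (placeholders)
    rename (A B C) (id name age)
    ```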



    • #3
      William has it right. There's an option with infix to put the specifications in-line. See the help file. Yours would look something like the following.
      Code:
      * Read each field by its column position; no dictionary file is needed
      infix str some_id 1-6 str husband_name 14-45 byte husband_age 46-47 str wife_name 58-89 wife_age 90-91 ///
          byte has_it 103 str birth_date 124-133 str start_date 139-148 ///
          str county_code 154-156 str county_name 175-194 using data.TXT
      list, noobs abbreviate(20)
      * Check that observations 10 through the last have no ID
      assert missing(some_id) in 10/l



      • #4
        Thank you, William! That's very informative!



        • #5
          Thank you, Joseph! I have just run your code. With some minor adjustments, it works for all my datasets!
