  • Stata unicode issue - imported Japanese characters are scrambled

    Hello:

    I'm using macOS Sierra (10.12.5) and Stata 15.0. I have a lot of csv files that I need to import into Stata, but when I do, the Japanese characters in my string variables are scrambled. I tried resetting my OS language to Japanese. Doing so allowed me to open the csv files in Excel with the Japanese appearing correctly; however, I still had the same problem when I imported the files directly into Stata.

    I did look into setting the unicode encoding and tried a few other ones out, but I couldn't solve this problem. This is probably a very specific problem, but if anyone has any ideas about what I might do, I would be very appreciative.

    Thank you and happy new year.

    Best,
    Gene Park

  • #2
    Sounds like they're in Shift-JIS or something similar.

    You can search the Internet for ways to convert the file into UTF-8. Perhaps there's something already in the Macintosh OS, inherited from UNIX. In the 1980s, there used to be several freebie DOS programs that would make conversions between the various encodings. Maybe there are analogues for the Macintosh if there's nothing built-in.
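
    One possible route for the UTF-8 conversion, without leaving Stata, is the unicode convertfile command (Stata 14 and later). The following is only a sketch: the file names are hypothetical, and it assumes the source file really is Shift-JIS, which is a guess until the import looks right.

    ```stata
    * Sketch only: "sales_sjis.csv" and "sales_utf8.csv" are hypothetical file names.
    * Show the encoding names Stata recognizes (the list is long).
    unicode encoding list

    * Write a UTF-8 copy of the assumed Shift-JIS file; the original is left untouched.
    unicode convertfile "sales_sjis.csv" "sales_utf8.csv", srcencoding(shift_jis) dstencoding(utf-8) replace

    * Read the converted copy, stating its encoding explicitly.
    import delimited using "sales_utf8.csv", encoding("utf-8") clear
    ```

    If the characters still look scrambled after this, the source encoding is probably something other than Shift-JIS, and another name from the list is worth trying.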

    Stat/Transfer will nail it, if you have access to that.

    Otherwise, write an Excel macro or VBA ditty to open them all in Excel (with your OS language set to Japanese) and save them as worksheets in a workbook, which will force the contents into Unicode.

    Edited to add: It looks like there is an option for Shift-JIS in Stata's delimited file import dialogue. Have you tried that? (Maybe try again after setting your operating system to Japanese.)
    Last edited by Joseph Coveney; 04 Jan 2019, 00:29.

    • #3
      Joseph: Thanks so much. I'll give your suggestions a try as soon as I get a chance. Best, Gene

      • #4
        Joseph: Your final suggestion worked perfectly! And your guess was correct. All one needs to do is add: encoding(shift_jis) after the import command. Thank you!
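
        For anyone with many files, the same option can be applied in a loop. This is a minimal sketch that assumes all the csv files sit in the current working directory and are all Shift-JIS; the saved .dta names are simply derived from the csv names and are not from the original post.

        ```stata
        * Sketch only: assumes every *.csv in the current directory is Shift-JIS.
        local files : dir "." files "*.csv"

        foreach f of local files {
            * Read one file, telling Stata the source encoding.
            import delimited using "`f'", encoding("shift_jis") clear

            * Save under the same name with a .dta extension (hypothetical naming scheme).
            local stub = subinstr("`f'", ".csv", "", .)
            save "`stub'.dta", replace
        }
        ```

        It is worth browsing the string variables of the first file before running the whole batch, since a wrong encoding guess still imports without an error.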
