Unicode error

Axel Demenet

Join Date: Apr 2015

Posts: 11
#1

17 Feb 2016, 07:52

I am trying to properly display a dataset from Peru (ENAHO), whose variable names and labels currently show up as garbled characters. Running "unicode analyze" fails with an error I cannot make sense of.
The first module of the survey has 297 variables and 9,600 observations (not huge). I cannot get Stata to run unicode analyze on any of the modules.

Running:

clear all
cd "$path"
unicode analyze Enaho01-2015-100.dta

I get the following error:

File summary (before starting):
1 file(s) specified
1 file(s) to be examined ...

File Enaho01-2015-100.dta (Stata dataset)
1 variable name needs translation
252 variable labels need translation
st_vlload(): 3300 argument out of range
examine_dta_vallab_content(): - function returned error
examine_dta_vallabs_content(): - function returned error
examine_dta_file(): - function returned error
examine_file(): - function returned error
do_examine_files(): - function returned error
unicode_do(): - function returned error
unicode_analyze(): - function returned error
<istmt>: - function returned error
Tags: unicode, unicode analyze, utf-8

Friedrich Huebler

Join Date: Apr 2014
Posts: 1053
#2

17 Feb 2016, 09:28

Did you obtain the data directly from INEI? I downloaded a file with a similar name ("Características de la Vivienda y del Hogar" from the 2015 ENAHO, first trimester) from http://iinei.inei.gob.pe/microdatos/index.htm and could unicode translate it without problems.

Code:

. unicode encoding set latin1
  (default encoding now latin1)

. unicode analyze "enaho01_2015_100.dta"

  File summary (before starting):
        1  file(s) specified
        1  file(s) to be examined ...

  File enaho01_2015_100.dta (Stata dataset)
        1 variable name needs translation
      252 variable labels need translation
       36 value-label contents need translation
          -----------------------------------------------------------------------------------------
          File needs translation.  Use unicode translate on this file.

  File enaho01_2015_100.dta needs translation

  File summary:
        1 file(s) need translation

. unicode translate "enaho01_2015_100.dta"
  (using latin1 encoding)

  File summary (before starting):
        1  file(s) specified
        1  file(s) to be examined ...

  File enaho01_2015_100.dta (Stata dataset)
      296 variable names okay, ASCII
        0 variable names okay, already UTF-8
        1 variable name translated
      all data labels okay, ASCII
       45 variable labels okay, ASCII
        0 variable labels okay, already UTF-8
      252 variable labels translated
      all value-label names okay, ASCII
       77 value-label contents okay, ASCII
        0 value-label contents okay, already UTF-8
       36 value-label contents translated
      all str# variables okay, ASCII
          -----------------------------------------------------------------------------------------
          File successfully translated

  File summary:
      all files successfully translated
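
To check the result, something along these lines should work. It is only a sketch: it assumes the translated file is in the current working directory, and the fallback encoding (windows-1252) is a guess at a plausible alternative, not something the file itself tells you. `unicode encoding list` shows the encoding names Stata accepts.

```stata
* Load the translated file and inspect names and labels;
* accented characters should now display correctly
use "enaho01_2015_100.dta", clear
describe
label list

* If the labels still look wrong, latin1 may have been the wrong guess.
* unicode commands need an empty dataset in memory, so clear first,
* then redo the translation from the backup with another encoding
* (windows-1252 here is an assumption).
clear
unicode encoding set windows-1252
unicode retranslate "enaho01_2015_100.dta"
```

unicode translate keeps a backup of the original file, which is what makes the retranslate step possible.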

Axel Demenet

Join Date: Apr 2015

Posts: 11
#3

17 Feb 2016, 09:42

Many thanks for this effort!
Yes, I downloaded it from the very same website. I am using Stata 14, and the files are stored in a Dropbox folder (although this worked fine with a different version of the survey, also stored in Dropbox). I really can't see where things go wrong. I may end up trying on a different computer.
Friedrich Huebler

Join Date: Apr 2014

Posts: 1053
#4

17 Feb 2016, 09:50

Did you try loading the file from a local drive on your PC? Also make sure that your Stata is up to date.
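
Roughly like this, as a sketch. It assumes the current working directory is the Dropbox folder holding the file, and C:/temp is a hypothetical local folder; use any directory on your own drive.

```stata
* Show the installed Stata version and check for available updates
about
update query

* Copy the file to a local folder and retry from there
* (C:/temp is a hypothetical path; substitute any local directory)
copy "Enaho01-2015-100.dta" "C:/temp/Enaho01-2015-100.dta", replace
cd "C:/temp"
clear all
unicode analyze "Enaho01-2015-100.dta"
```

If the same st_vlload() error appears on the local copy, the storage location is probably not the cause.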
Axel Demenet

Join Date: Apr 2015

Posts: 11
#5

17 Feb 2016, 10:21

I tried both.

I finally figured it out (a frustrating explanation):
my first attempts used a dataset converted from SPSS with Stat/Transfer 11. I ran the conversion again with Stat/Transfer 12, and now it works.

Thanks again, Friedrich.
Friedrich Huebler

Join Date: Apr 2014

Posts: 1053
#6

17 Feb 2016, 11:20

INEI provides the ENAHO data in SPSS and Stata format. Why don't you download the Stata files?

Now I also understand why the names of our files are different. I downloaded a ZIP archive containing the Stata file enaho01_2015_100.dta. Your file is called Enaho01-2015-100.dta because you converted the SPSS file Enaho01-2015-100.sav to Stata format.