Encoding issue in RTF with esttab/estout under Stata 14

Annina Hering

Join Date: Feb 2016
Posts: 2

08 Feb 2016, 05:39

Dear all,

I updated from Stata 13 to Stata 14 and used the “unicode” command to convert the labels in my do-files to Unicode.
I use the esttab command to export to RTF (see the example below). Since the update to Stata 14, encoding problems have emerged: characters such as ä, ö, ü are no longer displayed correctly in the RTF document.

Stata 13 saved the RTF document with ANSI encoding. Stata 14 saves it with UTF-8 encoding. Apparently the RTF format can’t handle Unicode, but under Stata 14 the esttab command saves the document only in Unicode. The RTF header still declares the file as “ansi” (see below).

Is there a way to force Stata to save the rtf document with ANSI encoding? Any other suggestions?

Best,
Annina

RTF-Header:

Code:

  {\rtf1\ansi\deff0 {\fonttbl{\f0\fnil Calibri;}}
{\info {\author .}{\company .}{\title .}{\creatim\yr2016\mo2\dy8\hr12\min31}}
\deflang1033\plain\fs16
{\footer\pard\qc\plain\f0\fs16\chpgn\par}

Creating RTF-Document:

Code:

esttab m1 m2 m3 ///
    using example.rtf, ///
    nonumbers nodepvars nonotes ///
    stats(aic bic N, fmt(%3,2f %3,0f) labels("AIC" "BIC" "N")) ///
    varwidth(35) modelwidth(6) b(%3,2f) se(%3,2f) gaps ///
    starlevels(+ 0.1 * 0.05 ** 0.01 *** 0.001) ///
    refcat(second0 "\emp{\i {\b Jahre}{\line (Ref.: xx)}}", label(" ")) ///
    order(second0 second1 second3 second4 second5 second6 ///
        0.erwerb_teilzeit 1.erwerb_teilzeit ///
        hh1income qu_hh1income) ///
    substitute("\f0\fnil Times New Roman" "\f0\fnil Calibri" "\fs20" "\fs16" "\fs24" "\fs20") ///
    title("Tab. 1:  TITLE)") ///
    mtitle("(1)" "(2)" "(3)") ///
    varlabels(_cons "\emp{\i {\b Konstante}}" ///
        qu_hh1income "\emp{\i Quadriertes Äquivalenzeinkommen}" ///
        hh1income "\emp{\i Äquivalenzeinkommen}" ///
        0.erwerb_teilzeit "  Elternzeit" ///
        1.erwerb_teilzeit "  Arbeitslos, nicht erwerbstätig") ///
    addnote("Source: [...] ." "Für imputierte Werte durch Flag-Variablen kontrolliert.")


Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#2

08 Feb 2016, 08:12

I don't know how to force Stata to do that, but I am surprised, since the special characters visible in your code are among the few that are also included in the ANSI character set.
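
That the characters exist in ANSI does not help if the bytes written to the file are UTF-8. A minimal sketch of the mismatch, assuming Stata 14 or later, where uchar(), strlen(), and ustrlen() are built-in functions:

```stata
* uchar(228) is U+00E4 (ä); Stata 14 and later hold it as UTF-8.
* strlen() counts bytes: this displays 2 (the bytes 0xC3 0xA4).
display strlen(uchar(228))
* ustrlen() counts Unicode characters: this displays 1.
display ustrlen(uchar(228))
```

A reader that follows the \ansi declaration in the header treats each byte as one character, so the two bytes would typically show up as "Ã¤" instead of "ä".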
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

08 Feb 2016, 09:37

Welcome to Statalist!

The estout command is not part of core Stata but rather a user-written package published initially in the Stata Journal, as

Code:

search estout

suggests. It is described at

Code:

net sj 14-2 st0085_2

which includes the author's name and email address for support.

As I see it, the estout command, not core Stata, creates and writes the RTF headers and content, so it is estout that needs to be coerced into writing Unicode characters in ANSI encoding or, if that is possible, into creating an RTF file with Unicode encoding. Perhaps contacting the author will yield assistance.

Last edited by William Lisowski; 08 Feb 2016, 09:41.
Hua Peng (StataCorp)

StataCorp Employee

Join Date: Jun 2014

Posts: 343
#4

08 Feb 2016, 16:14

If all the Unicode characters you need to deal with are covered by the Latin-1 encoding, it is fairly simple to translate the RTF file you get to the ISO-8859-1 encoding.

What you need to do is use the unicode convertfile command. Suppose before.rtf is the file you get from Stata 14 (which, by the way, is in UTF-8 encoding):

Code:

unicode convertfile before.rtf after.rtf, dstencoding(ISO-8859-1)

will generate a file after.rtf, which is in ISO-8859-1 encoding and should display extended ASCII characters correctly.
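
Put together with the export step, the workflow might look like the sketch below. It is only an illustration: m1 and m2 stand in for previously stored estimates, the file names are placeholders, and the replace option on unicode convertfile is taken from my reading of its help file, so please check it in your version.

```stata
* Sketch only: m1 and m2 are hypothetical stored estimates
* (e.g. saved earlier with estimates store).
esttab m1 m2 using before.rtf, replace

* Re-encode the UTF-8 output as ISO-8859-1 so that the bytes match
* the \ansi declaration in the RTF header. replace allows the
* do-file to be rerun when after.rtf already exists.
unicode convertfile before.rtf after.rtf, dstencoding(ISO-8859-1) replace
```

Typing unicode encoding list should show the encoding names Stata accepts in dstencoding(), in case a different single-byte code page is a better match for your system.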

Characters beyond ISO-8859-1, for example Chinese characters, are a totally different animal and have no easy solution.
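
If you are unsure whether a table contains such characters, one option is to let the conversion fail rather than alter them quietly. A minimal sketch, assuming the dstcallback() option behaves as described in help unicode convertfile (worth verifying in your Stata version):

```stata
* Sketch only: stop at the first character that ISO-8859-1 cannot
* represent, instead of substituting or skipping it.
unicode convertfile before.rtf after.rtf, dstencoding(ISO-8859-1) dstcallback(stop) replace
```

If the command stops with an error, the file holds at least one character outside Latin-1, and the simple conversion above is not enough for that table.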
Annina Hering

Join Date: Feb 2016

Posts: 2
#5

09 Feb 2016, 02:53

Thank you Hua! Your solution worked out well.
Bo Chen

Join Date: May 2015

Posts: 29
#6

25 Nov 2018, 22:01

Hi Hua,

I am wondering whether there is a solution for Chinese characters by now.

Thanks