
  • Encoding issue in RTF with esttab/estout under Stata 14

    Dear all,

    I updated from Stata 13 to Stata 14 and used the “unicode” command to convert the labels in my do-files to Unicode.
    I use the esttab command to export to RTF (see the example below). Since the update to Stata 14, encoding problems have emerged: characters such as ä, ö, ü are no longer displayed correctly in the RTF document.

    Stata 13 saved the RTF document with ANSI encoding. With Stata 14, the RTF document is saved with UTF-8 encoding. Apparently the RTF format can’t handle Unicode, but under Stata 14 the esttab command saves the document only in Unicode. The RTF header still declares the encoding as “ansi” (see below).

    Is there a way to force Stata to save the rtf document with ANSI encoding? Any other suggestions?

    Best,
    Annina

    RTF-Header:
    Code:
      {\rtf1\ansi\deff0 {\fonttbl{\f0\fnil Calibri;}}
      {\info {\author .}{\company .}{\title .}{\creatim\yr2016\mo2\dy8\hr12\min31}}
      \deflang1033\plain\fs16
      {\footer\pard\qc\plain\f0\fs16\chpgn\par}

    Creating RTF-Document:
    Code:
      esttab m1 m2 m3 ///
          using example.rtf, ///
          nonumbers nodepvars nonotes ///
          stats(aic bic N, fmt(%3,2f %3,0f) labels("AIC" "BIC" "N")) ///
          varwidth(35) modelwidth(6) b(%3,2f) se(%3,2f) gaps ///
          starlevels(+ 0.1 * 0.05 ** 0.01 *** 0.001) ///
          refcat (second0 "\emp{\i {\b Jahre}{\line (Ref.: xx)}}" , label(" ")) ///
          order (second0 second1 second3 second4 second5 second6 /*
          */ 0.erwerb_teilzeit 1.erwerb_teilzeit /*
          */ hh1income qu_hh1income) ///
          substitute("\f0\fnil Times New Roman" "\f0\fnil Calibri" "\fs20" "\fs16" "\fs24" "\fs20") ///
          title ("Tab. 1:  TITLE)") ///
          mtitle("(1)" "(2)" "(3)") ///
          varlabels (_cons "\emp{\i {\b Konstante}}" /*
          */ qu_hh1income "\emp{\i Quadriertes Äquivalenzeinkommen}" /*
          */ hh1income "\emp{\i Äquivalenzeinkommen}" /*
          */ 0.erwerb_teilzeit "  Elternzeit" /*
          */ 1.erwerb_teilzeit "  Arbeitslos, nicht erwerbstätig") ///
          addnote("Source: [...] ." "Für imputierte Werte durch Flag-Variablen kontrolliert.")



  • #2
    I don't know how to force Stata to do that, but I am surprised, as the special characters visible in your code are among the few that are also included in the ANSI set.



    • #3
      Welcome to Statalist!

      The estout command is not part of core Stata but rather a user-written package published initially in the Stata Journal, as
      Code:
      search estout
      suggests. It is described at
      Code:
      net sj 14-2 st0085_2
      which includes the author's name and email address for support.

      As I see it, the estout command, not core Stata, is creating and writing the RTF headers and content, and so it is estout that needs to be coerced to support ANSI encoding of Unicode characters or, if that is possible, to create an RTF file with Unicode encoding. Perhaps contacting the author will yield assistance.
      Last edited by William Lisowski; 08 Feb 2016, 10:41.



      • #4
        If all the Unicode you need to deal with is covered by the Latin-1 encoding, it is fairly simple to translate the RTF file you get to the ISO-8859-1 encoding.

        What you need to do is use the unicode convertfile command. Suppose the file before.rtf is what you get from Stata 14 (which, by the way, is in UTF-8 encoding):

        Code:
        unicode convertfile before.rtf after.rtf, dstencoding(ISO-8859-1)
        will generate a file after.rtf, which is in ISO-8859-1 encoding and should display extended ASCII characters correctly.

        Characters beyond ISO-8859-1, for example Chinese characters, are a totally different animal and have no easy solution.
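
        For reference, the two steps (export, then re-encode) could be put together roughly as follows. This is only a sketch under a few assumptions: the user-written estout package is installed, and the auto dataset, the German label, and the model name m1 are stand-ins chosen for illustration, not taken from the original do-file. The replace option is added so that the sketch can be rerun; the behavior for characters that the target encoding cannot represent is governed by the dstcallback() option described in help unicode convertfile.

        ```stata
        * Sketch only: assumes the user-written estout package is installed
        * (ssc install estout). Dataset, label, and model name are illustrative.
        sysuse auto, clear
        label variable mpg "Verbrauch (geschätzt)"   // label containing a non-ASCII character
        regress price mpg
        estimates store m1

        * Step 1: under Stata 14, esttab writes the RTF file in UTF-8.
        esttab m1 using before.rtf, label replace

        * Optional: list the encoding names Stata knows that match a pattern.
        unicode encoding list *8859*

        * Step 2: re-encode to ISO-8859-1 so the bytes agree with the \ansi header.
        unicode convertfile before.rtf after.rtf, dstencoding(ISO-8859-1) replace
        ```

        Writing to a separate file (after.rtf) keeps the UTF-8 original intact, which makes it easy to compare the two files if a character still displays incorrectly.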



        • #5
          Thank you Hua! Your solution worked out well.



          • #6
            Originally posted by Hua Peng (StataCorp)
            If all the Unicode you need to deal with is covered by the Latin-1 encoding, it is fairly simple to translate the RTF file you get to the ISO-8859-1 encoding.

            What you need to do is use the unicode convertfile command. Suppose the file before.rtf is what you get from Stata 14 (which, by the way, is in UTF-8 encoding):

            Code:
            unicode convertfile before.rtf after.rtf, dstencoding(ISO-8859-1)
            will generate a file after.rtf, which is in ISO-8859-1 encoding and should display extended ASCII characters correctly.

            Characters beyond ISO-8859-1, for example Chinese characters, are a totally different animal and have no easy solution.
            Hi Hua,

            I am wondering whether there is a solution for Chinese characters by now.

            Thanks

