
  • Divide a string that includes Chinese characters into two columns

    Dear all,
    I would like to know how to split the following strings into two columns, one for the Chinese characters and one for the numbers. Thank you in advance.
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str18 var1
    "北京1,520,228.3"
    "天津391,875.1"
    "河北347,574.0"
    "山西7,058.6"
    "内蒙古59,684.1"
    "辽宁544,727.2"
    "吉林52,199.7"
    "黑龙江113,987.5"
    end
    Desired result:
    北京 1520228.3
    天津 391875.1
    河北 347574
    山西 7058.6
    内蒙古 59684.1
    辽宁 544727.2
    吉林 52199.7
    黑龙江 113987.5
    Last edited by Liu wei; 22 Apr 2019, 08:46.

  • #2
    Code:
    gen s = ustrregexs(1) if ustrregexm(var1, "(^[^0-9]+)[0-9]")
    
    gen double n = real(subinstr(ustrregexs(1),",","",.)) if ustrregexm(var1, "([0-9,.]+$)")
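    For readers who want both pieces from one pattern, here is a minimal sketch, assuming every value is a run of non-digit characters followed by a number with comma thousands separators. The variable names name_cn and value are illustrative only and do not come from the original post.

    ```stata
    * One pattern, two capture groups: (1) leading non-digits, (2) the trailing number.
    gen name_cn = ustrregexs(1) if ustrregexm(var1, "^([^0-9]+)([0-9,.]+)$")

    * Strip the commas from group 2 before converting to a number.
    gen double value = real(subinstr(ustrregexs(2), ",", "", .)) if ustrregexm(var1, "^([^0-9]+)([0-9,.]+)$")

    * Show one decimal place instead of the default general format.
    format value %12.1f

    * List any rows the pattern did not match, rather than assuming all matched.
    list var1 if missing(name_cn) | missing(value)
    ```

    If the final list command shows any rows, those strings do not follow the assumed layout and should be inspected by hand.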
    Last edited by Bjarte Aagnes; 22 Apr 2019, 08:55.



    • #3
      That's awesome, thank you very much, Bjarte Aagnes!



      • #4
        Thanks, Liu

        Another regex solution is to use the Unicode script property:
        Code:
        gen s = ustrregexs(1) if ustrregexm(var1, "^(\p{Han}+)")
        
        gen double n = real(subinstr(subinstr(var1,s,"",1),",","",.))
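        A possible variation, sketched here under the assumption that s and n were created by the two lines above: remove the Han prefix with ustrregexra() and let destring drop the commas. The names num_str and n_alt are illustrative only.

        ```stata
        * Delete the leading run of Han characters, leaving only the number text.
        gen num_str = ustrregexra(var1, "^\p{Han}+", "")

        * Convert to numeric, ignoring the comma thousands separators.
        destring num_str, generate(n_alt) ignore(",")

        * Cross-check against n; assert stops with an error if any row differs.
        assert n_alt == n
        ```

        The assert line is only a sanity check and can be dropped once the two results agree.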
        Refs:
        https://www.regular-expressions.info/unicode.html
        http://userguide.icu-project.org/strings/regexp
        Last edited by Bjarte Aagnes; 22 Apr 2019, 09:54.



        • #5
          Thank you!
