Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • String variable to Numeric variable

    Dear everyone,

    I'm working on a large dataset that comes from a regulator. Some variables are csv and I imported in stata.
    I have two main issues and one is the following. I have a string variable that I would like to convert in numeric one where "," is decimal separator and "." is thousand separator.
    I'm wondering what could be the best approach to convert in a simple stata numeric variable

    I tried the
    Code:
    real
    function but it simplies give me missing values for all of them.
    I also explored the properties of the variable:
    Type: str9
    Format: %9s

    Someone that would be so kind with any advice?
    Thank you in advance

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str9 adnt2017
    "0,40"    
    "30,73"   
    "0,09"    
    "66,82"   
    "71,04"   
    "0,04"    
    "0,01"    
    "0,03"    
    "0,28"    
    "134,67"  
    "29,99"   
    "197,51"  
    "0,26"    
    "207,58"  
    "265,90"  
    "2.321,99"
    "502,89"  
    "2,14"    
    "1,19"    
    "2,22"    
    "0,55"    
    "13,51"   
    "2,71"    
    "0,33"    
    "0,28"    
    "0,74"    
    "743,14"  
    "1.287,56"
    "3.107,41"
    "266,04"  
    "573,07"  
    "0,22"    
    "112,46"  
    "0,09"    
    "532,67"  
    "0,40"    
    "0,25"    
    "0,57"    
    "1.497,67"
    "0,00"    
    "194,11"  
    "1,21"    
    "2.754,51"
    "696,78"  
    "0,30"    
    "0,44"    
    "179,96"  
    "1,93"    
    "1,21"    
    "0,71"    
    "4,80"    
    "17,76"   
    "198,40"  
    "0,32"    
    "0,98"    
    "0,89"    
    "1.389,80"
    "41,11"   
    "0,48"    
    "464,54"  
    "28,22"   
    "1.050,54"
    "0,02"    
    "38,68"   
    "1.411,40"
    "4,46"    
    "613,49"  
    "0,47"    
    "767,94"  
    "36,02"   
    "2.278,11"
    "154,79"  
    "710,48"  
    "246,48"  
    "194,41"  
    "38,89"   
    "0,21"    
    "1,07"    
    "866,90"  
    "0,28"    
    "0,03"    
    "0,50"    
    "0,71"    
    "0,00"    
    "1.143,77"
    "2,34"    
    "756,73"  
    "956,82"  
    "102,52"  
    "1.433,31"
    "0,00"    
    "0,02"    
    "0,00"    
    "0,00"    
    "322,59"  
    "0,07"    
    "0,18"    
    "0,00"    
    "0,00"    
    "0,00"    
    end

  • #2
    remove thousand separator,
    Code:
    replace adnt2017 = subinstr(adnt2017,".","",.)
    replace decimal separator,
    Code:
    replace adnt2017 = subinstr(adnt2017,",",".",.)
    destring,
    Code:
    destring(adnt2017), replace

    Comment


    • #3
      the following works on your example data:
      Code:
      destring adnt2017, gen(adnt2017b) dpcomma ignore(".")
      see
      Code:
      help destring

      Comment


      • #4
        Code:
        // this does the work
        destring adnt2017, gen(adnt2017_num) dpcomma ignore(".")
        
        // cosmetic, just add a 1000s mark to make it easier to read
        format adnt2017_num %9.2fc
        
        // admire the result
        list
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------

        Comment


        • #5
          Thanks Øyvind Snilsberg and Rich Goldstein

          I finally got a solution with which is exactly what you rich suggest

          I did think about to play with destring function.

          Thanks again for your time!

          Comment

          Working...
          X