  • destring non numeric values

    Hello
    I am Hatem Ali, a Stata user.
    I have two large datasets that I am trying to merge.
    The common variable in these two datasets is called trr_id_code.
    However, this variable is a non-numeric string variable.

    When I try a one-to-one merge using this key variable, Stata does not do it and says the variables are unequal.

    When I tried to destring it, I got the following message:
    destring trr_id_code, generate(hhh)
    trr_id_code contains nonnumeric characters; no generate

    1st question: Is there a way to destring this variable so I can use it for merging?
    2nd question: Can I merge one-to-one by observation instead?

    Looking forward to hearing back from you.

  • #2
    Hatem:
    If you have a unique observation for each -trr_id_code-, you may want to try:
    Code:
    . g trr_id_code="A" in 1
    (1 missing value generated)
    
    . replace trr_id_code="B" in 2
    (1 real change made)
    
    . egen id_num=group( trr_id_code)
    
    . list
    
         +-------------------+
         | trr_id~e   id_num |
         |-------------------|
      1. |        A        1 |
      2. |        B        2 |
         +-------------------+
    and then try -merge-.

    As an aside, things would be easier to inspect if you shared an excerpt/example of your data via -dataex-.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      first, you can merge on string variables; your report on the message from Stata is unclear, please post exactly what Stata said (and put it in CODE blocks - see the FAQ)

      second, -destring- is for variables that are really numeric but happen to be string due to the way they were imported or input; -encode- is for making numeric variables out of genuinely non-numeric string variables

      third, if you really want to merge by observation, go into each data set and
      Code:
      gen newid=_n
      sort newid
      and then -merge- on "newid"

      see
      Code:
      help merge
      help encode
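
      A minimal sketch of the first point, merging directly on the string key with no conversion. The file names first.dta and second.dta are hypothetical, and a 1:1 merge assumes trr_id_code is unique in both files:

      ```stata
      * Hypothetical file names; trr_id_code stays a string throughout
      use first.dta, clear
      * Stops with an error if the key does not uniquely identify observations
      isid trr_id_code
      merge 1:1 trr_id_code using second.dta
      * Show how many observations matched and how many did not
      tabulate _merge
      ```

      Numeric codes created separately in each file (with -egen, group()- or -encode-) need not correspond across files, so the original string is usually the safer key.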


      • #4
        Thanks, Rich and Carlo, for getting back to me.

        This is an example of the data. The variable trr_id_code is a string variable.
        When I try a one-to-one merge on it, the following message appears: variable trr_id_code does not uniquely identify observations in the master data

        This message still appears even after removing the duplicates.
        trr_id_code dx_date_lymph pathology steroids_maint
        A100140 19/07/1999 1 1
        A100440 29/10/2014 0 1
        A100980 12/07/2013 1 1
        A102013 02/11/2008 0 1
        A102130 13/01/2001 1
        A103307 01/01/2000 1 0
        This is an example of the second dataset:
        trr_id_code alg_ind alg_days
        A100 1 4
        A100001 1 5
        A100002 1 6
        A100005 0 0
        A100006 0 0
        A100007 0 0
        Last edited by hatem ali; 08 Jun 2019, 14:56.
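
        The message means that at least one trr_id_code value occurs more than once in the master data. A sketch of how one might check this, assuming the master data are in memory; second.dta and dup are hypothetical names, and the final m:1 merge assumes the key is unique in the second file:

        ```stata
        * Count how many times each key value occurs
        duplicates report trr_id_code
        * Flag repeated keys: dup is 0 for unique keys, 1 or more otherwise
        duplicates tag trr_id_code, generate(dup)
        list trr_id_code if dup > 0, sepby(trr_id_code)
        * Empty-string keys also count as repeats of one another
        count if missing(trr_id_code)
        * If repeats in the master data are legitimate, a many-to-one merge may fit
        merge m:1 trr_id_code using second.dta
        ```

        Note that -duplicates drop- without a variable list removes only observations that are identical on every variable, so repeated keys can remain afterwards.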


        • #5
          This is the description and summary of the variable:

                        storage   display    value
          variable name   type    format     label      variable label

          trr_id_code     str16   %16s                  ENCRYPTED TRR_ID

          . summarize trr_id_code

             Variable    Obs    Mean    Std. Dev.    Min    Max

          trr_id_code      0
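
          -summarize- reports 0 observations here because it computes statistics only for numeric variables; it does not mean the variable is empty. A sketch of commands that do describe a string variable, assuming the data are in memory:

          ```stata
          * Number of empty-string (missing) keys
          count if missing(trr_id_code)
          * Unique values, missing count, and example values for a string variable
          codebook trr_id_code
          * Stops with an error if the key does not uniquely identify observations
          isid trr_id_code
          ```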


          • #6
            Thanks all,

            Got it, done.
