  • Destring ICD-10 codes

    Dear Statalist,

    I'm trying to destring a variable containing ICD-10 codes.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str7 ICD10
      
    "J44.9"
    "J44.0"
    "J44.1"
    "J44.0"
    "J44.0"
    "J44.0"
    "J44.9"
    "M81.9"
    "J22"  
    "J44.9" 
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.0"
    "J44.1"
    "J44.9"
    "J44.9"
    "J44.8"
    "J45.9"
    "J44.0"
    "J44.0"
    "J44.0"
    "M80.9"
    "M81.0"
    "M81.9"
    "J84.9"
    "J84.9"
    "J84.9"
    "J84.9"
    "J84.9"
    "J84.9"
    "J84.9"
    "M81.0"
    "J22"  
    "M81.0"
    "M81.5"  
    "J45.9"
    "J44.8"
    "J45.9"
    "J44.9"
    "J44.0"
    "J40"  
    "J44.9"
    "J45.9" 
    "J20.9"
    "J44.0"
    "J44.0"
    "J45.9"
    "M81.0"
    "J44.9"
    "J44.9"
    "J22"  
    "M81.0"
    "M81.0"
    "J44.8"
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.9"
    "J44.0"
    "J44.9"
    "J44.0"
    "J44.9"
    "J44.0"
    "J44.9"
    "J20.9"
    "J44.0"
    "J44.0"
    "J44.1"
    "J44.9"
    "J44.1"
    "J44.1"
    end
    I tried using

    Code:
    destring ICD10, gen(ICD10*)
    ICD10: contains nonnumeric characters; no generate
    and

    Code:
    destring ICD10, gen(ICD10*) force
    ICD10: contains nonnumeric characters; ICD10* generated as byte
    (138380 missing values generated)
    The second command generates only missing values.

    I need to destring to the default format numeric(double) due to merging with another dataset.

    Any suggestions?

  • #2
    You can only destring numeric strings, so I am not sure what you have in mind as your final result.

    "I need to destring to the default format numeric(double) due to merging with another dataset."
    You can merge using string identifiers, so if this is your reason for destringing, it is ill-advised. Once you have all the variables merged into one dataset and need to create a numerical identifier, see

    Code:
    help encode
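As a rough sketch of that workflow, merging on the string key and only afterwards creating a numeric identifier might look like the following. The lookup file, the `descr` variable, and its labels are made up for illustration and are not taken from Sigrid's data.

```stata
* Hypothetical lookup table keyed on the string code
clear
input str7 ICD10 str30 descr
"J44.9" "COPD, unspecified"
"J22"   "Acute lower resp. infection"
"M81.0" "Postmenopausal osteoporosis"
end
tempfile lookup
save `lookup'

* Hypothetical patient-level data with the same string key
clear
input str7 ICD10
"J44.9"
"J44.9"
"J22"
"M81.0"
end

* Merge directly on the string variable; no destring needed
merge m:1 ICD10 using `lookup'

* Only now create an integer identifier, keeping the codes as value labels
encode ICD10, generate(ICD10_id)
list, noobs
```

encode numbers the distinct codes 1, 2, ... in alphabetical order and attaches the original strings as value labels, so the letters and decimal points are not lost.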



    • #3
      Like Andrew, I'm unclear as to what you want the result to be.

      Perhaps the icd10 command can assist in what you need to do.
      Code:
      help icd10
      There are other possibilities, but since having an identifier stored as a number, double or otherwise, with a fractional part is problematic, I'm reluctant to suggest them. Really, if "J44.9" is stored as 44.9 in the dataset you are merging to, I'd be concerned. Using fractional numbers as merge keys is just asking for trouble.
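To illustrate why fractional merge keys are risky, here is a small sketch, assuming one dataset happened to store 44.9 as a float and the other as a double. The variable names are hypothetical.

```stata
clear
set obs 1

* The same "code" stored at two different precisions
generate float  key_f = 44.9
generate double key_d = 44.9

* 44.9 has no exact binary representation, so the two stored values differ
count if key_f == key_d

* Showing more digits makes the difference visible
display %20.15f key_f
display %20.15f key_d
```

The count comes back as 0, so a match on these two keys would fail even though both variables display as 44.9 in the default format.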



      • #4
        For other readers: the International Classification of Diseases (ICD) codes are used to identify specific diseases. For example, if a health insurance claim has the ICD-10 code V95.43XA, you know that the person involved was injured or killed in a spacecraft collision. This is not a joke, or a typo. It's an actual code. (Presumably nobody sees this one very often, especially considering that space is vast and spacecraft are unlikely to collide, plus they move extremely fast and it's more likely that a collision would vaporize the occupants rather than just injure them.) Nevertheless, there are many more common codes for things like heart disease, depression, and what have you.

        ICD codes should always be stored as strings. Many ICD version 9 (ICD-9) diagnosis codes may look numeric, but a) some of them start with leading zeroes, which are informative, and b) some of them start with letters. Almost all ICD-10 codes have letters anyway.

        Perhaps Sigrid wants to generate flags to denote the presence of certain ICD-10 codes. The icd10 command can do that. It can also take code ranges, so you don't have to type out each code by hand! (Frequently, the codes for some disease span a range, e.g. V95.00XA to V95.9XXS designates any sort of accident, e.g. collision, crash, fire, involving any sort of powered aircraft, e.g. helicopters, private fixed-wing, commercial fixed-wing, spacecraft.) This seems like it would be the better approach.
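A minimal sketch of such flags, using a few codes from the example data in #1. It assumes a Stata release that ships the icd10 command and that the codes are WHO ICD-10; the flag names `copd` and `chronic_lower` are invented for illustration.

```stata
clear
input str7 ICD10
"J44.9"
"J44.0"
"M81.0"
"J22"
"J45.9"
end

* Flag every code beginning with J44
icd10 generate copd = ICD10, range(J44*)

* Flag a whole block of codes with a from/to range
icd10 generate chronic_lower = ICD10, range(J40/J47)

list, noobs
```

Each new variable is a 0/1 indicator, and the string variable itself is left untouched.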

        Perhaps Sigrid was provided an Excel sheet with some codes of interest marked, and there are too many to conveniently convert them to a range. In that case, as already observed, you can merge on a string variable.
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



        • #5
          Dear all,

          Thank you for your great responses, and for pointing me towards the icd10 commands. I converted the ICD10 codes stored as numeric into string, and went forward using icd10 check, clean and generate.
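For other readers, the check, clean and generate sequence might look roughly like this. It is only a sketch, not Sigrid's actual code: the new variable names are hypothetical, and it assumes WHO ICD-10 codes (Stata has a separate icd10cm command for ICD-10-CM).

```stata
clear
input str7 ICD10
"J44.9"
"J22"
"M81.0"
end

* Record whether each code is valid (0 means no problem found)
icd10 check ICD10, generate(problem)

* Store a consistently formatted copy of the codes
icd10 clean ICD10, generate(ICD10_clean)

* Attach the text description of each code
icd10 generate descr = ICD10_clean, description

list, noobs
```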

          All the best,
          Sigrid

