Hi,
I have spent a couple of days searching for a way to recode a string variable that contains letters and numbers (ICD-10 codes) into a numeric code corresponding to the ICD-10 list of the 298 main causes of morbidity. I know that a string variable cannot be recoded directly. In SPSS it is possible, although the syntax is very lengthy. In Stata, however, I was not even able to copy and paste the codes because there are so many.
I will attach a sample of the Excel file that lists the 298 cause groups and the number corresponding to each. For example, in row 1, column B, the code is A00, so I want to recode A00-A00.9 to 1; row 2 is A01, so A01* becomes 2; row 8 is A17-A19, so all of A17*, A18* and A19* are recoded to 8; and so on.
Is this possible in Stata?
1 | A00 |
2 | A01 |
3 | A03 |
4 | A06 |
5 | A09 |
6 | A02, A04–A05, A07–A08 |
7 | A15–A16 |
8 | A17–A19 |
9 | A20 |
10 | A23 |
11 | A30 |
12 | A33 |
13 | A34–A35 |
14 | A36 |
15 | A37 |
16 | A39 |
17 | A40–A41 |
18 | A21–A22, A24–A28, A31–A32, A38, A42–A49 |
Here is a sample of the original data: the string variable I want to recode, which contains the main ICD diagnosis codes.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str4 Diag_code
"O342"
"R104"
"K358"
"J931"
"O249"
"I10"
"N202"
"I249"
"K649"
"O269"
"O800"
"I501"
"M480"
"C509"
"Q211"
"A239"
"N133"
"N939"
"O800"
"I64"
"C543"
"J988"
"O800"
"N210"
"J459"
"A239"
"O034"
"C169"
"O800"
"R104"
"E111"
"J189"
"J069"
"I739"
"K409"
"K566"
"C509"
"U071"
"E114"
"I214"
"O800"
"A153"
"K358"
"O441"
"C509"
"I219"
"N309"
"F311"
"R572"
"K358"
end
So, I want the result to be 1 if Diag_code starts with A00, 2 if it starts with A01, ..., 8 if it starts with any of A17, A18 or A19, and so on for all 298 causes.
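To make the intended mapping concrete, here is a rough sketch of one way it might be expressed, not a tested solution for the full list. It assumes the first three characters of Diag_code identify the ICD-10 category, and that each group can be written as one or more contiguous ranges. The names cause, lo, hi and cat3 are made up for illustration, and only a few of the groups listed above are included.

```stata
* Sketch only: the range table covers a few of the groups listed above.
* Variable names cause, lo, hi and cat3 are hypothetical.

* 1. One row per contiguous range of three-character ICD-10 categories.
clear
input int cause str3 lo str3 hi
1 "A00" "A00"
6 "A02" "A02"
6 "A04" "A05"
6 "A07" "A08"
7 "A15" "A16"
8 "A17" "A19"
10 "A23" "A23"
end

* 2. Copy the table into local macros so it survives loading the main data.
local n = _N
forvalues i = 1/`n' {
    local c`i' = cause[`i']
    local lo`i' = lo[`i']
    local hi`i' = hi[`i']
}

* 3. Main data (a few values taken from the sample above).
clear
input str4 Diag_code
"A239"
"A153"
"O800"
"I10"
end

* 4. Compare the first three characters against each range.
*    inrange() on strings compares them in sort order.
generate str3 cat3 = substr(Diag_code, 1, 3)
generate int cause = .
forvalues i = 1/`n' {
    replace cause = `c`i'' if inrange(cat3, "`lo`i''", "`hi`i''")
}
list Diag_code cause
```

Because local macros are used, the whole block would have to be run in one go from a do-file. In this sketch A239 ends up as 10, A153 as 7, and codes outside the listed ranges stay missing. For all 298 causes, the lo/hi rows would presumably come from the Excel list rather than being typed in.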