Problem with replacing

Nader Mehri

Join Date: Jun 2019
Posts: 189

25 Mar 2022, 10:50

Hi,
For the sample data below, I would like to replace the state abbreviations with the full state names (e.g., OR --> Oregon). The full state names are available from 1999 onwards. Each state has the same fips code whether it is recorded by its abbreviation or by its full name. Thanks for any help.
Best,
NM

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float Year str20 state float fips
2000 "OR"             41
1997 "PA"             42
1997 "RI"             44
1997 "SC"             45
1997 "SD"             46
1997 "TN"             47
1997 "TX"             48
1997 "UT"             49
1997 "VT"             50
1997 "VA"             51
1997 "WA"             53
1997 "WV"             54
1997 "WI"             55
1997 "WY"             56
1998 "OR"             41
1998 "PA"             42
1998 "RI"             44
1998 "SC"             45
1998 "SD"             46
1998 "TN"             47
1998 "TX"             48
1998 "UT"             49
1998 "VT"             50
1998 "VA"             51
1998 "WA"             53
1998 "WV"             54
1998 "WI"             55
1998 "WY"             56
1999 "Oregon"         41
1999 "Pennsylvania"   42
1999 "Rhode Island"   44
1999 "South Carolina" 45
1999 "South Dakota"   46
1999 "Tennessee"      47
1999 "Texas"          48
1999 "Utah"           49
2000 "Vermont"        50
2000 "Virginia"       51
2000 "Washington"     53
2001 "West Virginia"  54
2002 "Wisconsin"      55
2002 "Wyoming"        56
end

Tags: data, foreach, replace, string

Clyde Schechter

Join Date: Apr 2014

Posts: 29801
#2

25 Mar 2022, 10:57

Code:

frame put state fips if length(state) > 2, into(reference)
frame reference: duplicates drop
frlink m:1 fips, frame(reference)
replace state = frval(reference, state) if !missing(reference)

Note: This code will work provided every state that appears in the data with a two-letter abbreviation also appears somewhere in the data with its full name. If that is not the case, those observations will be left with just the two-letter abbreviation. If that happens, it will probably be in only a few cases, and you can finish the job with a few -replace- statements.
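
A minimal sketch of that clean-up step, assuming the frame code above has already been run. The NJ/New Jersey pair is purely hypothetical (it does not appear in the example data) and only illustrates the pattern:

Code:

* list any abbreviations that found no full-name match
tab state if length(state) == 2

* hypothetical example: fill in one leftover abbreviation by hand
replace state = "New Jersey" if state == "NJ"

In the example data every fips code has a full name somewhere, so the -tab- should report no observations and no manual -replace- should be needed.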
Nader Mehri

Join Date: Jun 2019

Posts: 189
#3

25 Mar 2022, 11:00

Thanks. With that code, everything is now in good shape.
Clyde Schechter

Join Date: Apr 2014

Posts: 29801
#4

25 Mar 2022, 11:00

Actually, there's a simpler way. The state abbreviations are upper case, and the names are in proper case. Since upper case letters sort before lower case letters, the abbreviation will always sort before the full name. So the following works:

Code:

by fips (state), sort: replace state = state[_N]

This code may not work, however, if the use of upper and lower case is not consistent in the way described. (It is consistent in the example data.)
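
One possible check to run before relying on this, sketched under the assumption that every full name is longer than two characters and every abbreviation is exactly two:

Code:

* within each fips, the last value in sort order is the one -replace- would copy
* stop with an error if that value is not longer than two characters
by fips (state), sort: assert length(state[_N]) > 2

If the -assert- fails, some fips code either has no full name in the data or has a name that sorts before its abbreviation, and the one-line -replace- would leave or spread an abbreviation for that state.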
Nader Mehri

Join Date: Jun 2019

Posts: 189
#5

25 Mar 2022, 11:13

Looks like the new code works too. Thanks. But I got an error when rerunning the former code: frame reference already defined
Clyde Schechter

Join Date: Apr 2014

Posts: 29801
#6

25 Mar 2022, 12:30

But I got an error when rerunning the former code: frame reference already defined

It didn't happen the first time you ran the code, I'm betting, but it did happen later. I've stumbled on this often myself.

When you clear the data in memory, either with the -clear- command itself (as at the top of the -dataex- output) or with -use new_data_set, clear-, Stata does not remove existing frames. So if you run the code in #2 a second time, the frame reference is still there from the previous run. But -frame put- requires that the target frame be a new one. So the solution is either to precede the -frame put- command with -frame drop reference- or, before loading the data set, to run -clear *- (equivalently, -clear all-), which does clear frames.
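
A plain -frame drop reference- will itself stop with an error on a first run, when the frame does not yet exist. A sketch of one way to make the code in #2 safe to rerun, assuming the frame name reference is not used for anything you want to keep:

Code:

* drop the frame if it exists; -capture- suppresses the error when it does not
capture frame drop reference
frame put state fips if length(state) > 2, into(reference)

The remaining lines of the code in #2 follow unchanged.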