  • Problem with replacing

    Hi,
    For the sample data below, I would like to replace the state abbreviations with the full state names (e.g., OR-->Oregon). The full state names are available from 1999 onwards. Each state has the same fips code whether it appears under its abbreviation or its full name. Thanks for any help.
    Best,
    NM
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float Year str20 state float fips
    2000 "OR"             41
    1997 "PA"             42
    1997 "RI"             44
    1997 "SC"             45
    1997 "SD"             46
    1997 "TN"             47
    1997 "TX"             48
    1997 "UT"             49
    1997 "VT"             50
    1997 "VA"             51
    1997 "WA"             53
    1997 "WV"             54
    1997 "WI"             55
    1997 "WY"             56
    1998 "OR"             41
    1998 "PA"             42
    1998 "RI"             44
    1998 "SC"             45
    1998 "SD"             46
    1998 "TN"             47
    1998 "TX"             48
    1998 "UT"             49
    1998 "VT"             50
    1998 "VA"             51
    1998 "WA"             53
    1998 "WV"             54
    1998 "WI"             55
    1998 "WY"             56
    1999 "Oregon"         41
    1999 "Pennsylvania"   42
    1999 "Rhode Island"   44
    1999 "South Carolina" 45
    1999 "South Dakota"   46
    1999 "Tennessee"      47
    1999 "Texas"          48
    1999 "Utah"           49
    2000 "Vermont"        50
    2000 "Virginia"       51
    2000 "Washington"     53
    2001 "West Virginia"  54
    2002 "Wisconsin"      55
    2002 "Wyoming"        56
    end

  • #2
    Code:
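    * Copy the observations that carry a full state name into a new frame
    frame put state fips if length(state) > 2, into(reference)
    * Keep one row per fips/full-name pair
    frame reference: duplicates drop
    * Link each observation to its fips match in the reference frame
    frlink m:1 fips, frame(reference)
    * Overwrite state with the full name wherever a match was found
    replace state = frval(reference, state) if !missing(reference)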
    Note: This code will work provided every state that appears in the data with a 2 letter abbreviation also appears somewhere in the data with the full name. If that is not the case, those observations will be left with just the 2 letter abbreviation. If that happens, it will probably only be in a few cases and you can finish the job with a few -replace- statements.
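    As a rough sketch of that clean-up step, the example below runs the same code on a hypothetical three-row data set in which one code (fips 72, "PR") never appears with a full name. The data rows, the -capture frame drop- line, and the final hand-written -replace- are illustrative assumptions, not part of the original data or code.

    ```stata
    * Hypothetical mini data set: "PR" never appears with its full name
    clear
    capture frame drop reference   // assumption: discard any frame left over from an earlier run
    input float Year str20 state float fips
    1997 "OR"     41
    1999 "Oregon" 41
    1997 "PR"     72
    end

    * The code from this post
    frame put state fips if length(state) > 2, into(reference)
    frame reference: duplicates drop
    frlink m:1 fips, frame(reference)
    replace state = frval(reference, state) if !missing(reference)

    * Show whatever is still a 2-letter abbreviation ...
    list Year state fips if length(state) == 2

    * ... and finish those cases by hand
    replace state = "Puerto Rico" if state == "PR"
    ```

    Listing the leftovers first keeps the manual -replace- statements limited to the abbreviations that actually remain.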


    • #3
      Thanks. Using the code, I have got everything in a good shape.


      • #4
        Actually, there's a simpler way. The state abbreviations are upper case, and the names are in proper case. Since upper case letters sort before lower case letters, the abbreviation will always sort before the full name. So the following works:
        Code:
        by fips (state), sort: replace state = state[_N]
        This code may not work, however, if the use of upper and lower case is not consistent in the way described. (It is consistent in the example data.)
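        One way to guard against that, sketched here on a hypothetical four-row extract of the example data, is to -assert- the assumptions before overwriting anything. The two checks are additions for illustration; if either fails, Stata stops with an error and state is left unchanged.

        ```stata
        * Hypothetical extract of the example data
        clear
        input float Year str20 state float fips
        1997 "OR"            41
        1999 "Oregon"        41
        1997 "WV"            54
        2001 "West Virginia" 54
        end

        * Check 1: every 2-character value is entirely upper case
        assert state == upper(state) if length(state) == 2

        * Check 2: within each fips code, the value that sorts last is a
        * full name (longer than 2 characters), not an abbreviation
        bysort fips (state): assert length(state[_N]) > 2

        * The one-line replacement from this post
        by fips (state), sort: replace state = state[_N]
        ```

        Check 2 also fails when a fips code never appears with a full name, so it covers the caveat noted in #2 as well.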


        • #5
          Looks like the new code works too. Thanks. But I got an error when rerunning the former code: frame reference already defined


          • #6
            "But I got an error when rerunning the former code: frame reference already defined"
            It didn't happen the first time you ran the code, I'm betting, but it did happen later. I've stumbled on this often myself.

            When you clear the data in Stata, either with the -clear- command itself (as at the top of the -dataex- example) or with -use new_data_set, clear-, Stata does not remove existing frames. So if you run the code in #2 a second time, the frame reference is still there from the previous run. But -frame put- requires that the frame named in -into()- be a new one. So the solution is either to precede the -frame put- command with -frame drop reference-, or to run -clear *- (a synonym for -clear all-) before loading the data set, which does clear frames.
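            A minimal sketch of the first option, on a hypothetical two-row data set: the -capture- prefix is an addition here, used so that -frame drop- does not stop with an error on a first run, when the frame does not exist yet.

            ```stata
            * Hypothetical two-row data set
            clear
            input float Year str20 state float fips
            1997 "OR"     41
            1999 "Oregon" 41
            end

            * Drop the frame if it survives from an earlier run; -capture-
            * suppresses the error when there is no such frame to drop
            capture frame drop reference

            * The code from #2, which can now be rerun as often as needed
            frame put state fips if length(state) > 2, into(reference)
            frame reference: duplicates drop
            frlink m:1 fips, frame(reference)
            replace state = frval(reference, state) if !missing(reference)
            ```

            Running this block twice in the same session should no longer produce the "already defined" error, because each run starts by discarding the old frame.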
