  • Matching observations in two columns in Stata

    I have two variables, countyname and countynamegini, and I want to match each observation in countyname against the values in countynamegini. For example, I want to keep the observations "Baldwin, AL" and "Calhoun, AL" in countyname and drop the other observations shown in the sample data below. How would I do this? I tried running a loop, but I got a syntax error.

    [CODE]
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str29 countyname str42 countynamegini
    "UNITED STATES" "Baldwin County, Alabama"
    "ALABAMA" "Calhoun County, Alabama"
    "Autauga, AL" "Cullman County, Alabama"
    "Baldwin, AL" "DeKalb County, Alabama"
    "Barbour, AL" "Elmore County, Alabama"
    "Bibb, AL" "Etowah County, Alabama"
    "Blount, AL" "Houston County, Alabama"
    "Bullock, AL" "Jefferson County, Alabama"
    "Butler, AL" "Lauderdale County, Alabama"
    "Calhoun, AL" "Lee County, Alabama"
    end
    [/CODE]

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str29 countyname str42 countynamegini
    "UNITED STATES" "Baldwin County, Alabama"   
    "ALABAMA"       "Calhoun County, Alabama"   
    "Autauga, AL"   "Cullman County, Alabama"   
    "Baldwin, AL"   "DeKalb County, Alabama"    
    "Barbour, AL"   "Elmore County, Alabama"    
    "Bibb, AL"      "Etowah County, Alabama"    
    "Blount, AL"    "Houston County, Alabama"   
    "Bullock, AL"   "Jefferson County, Alabama" 
    "Butler, AL"    "Lauderdale County, Alabama"
    "Calhoun, AL"   "Lee County, Alabama"       
    end
    
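    * First word of each gini name, lowercased (e.g. "baldwin")
    gen county = word(lower(countynamegini), 1)
    * Collect the distinct county words into one regex alternation: a|b|c
    levelsof county, local(counties) sep("|") clean
    * Pad countyname with spaces so a name at either end still has a delimiter on both sides
    gen match = regexm(" " + lower(countyname) + " ", "['!?,\. ](`counties')['!?,\. ]")
    * Keep only the rows whose countyname contains one of the county words
    keep if match==1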
    Result:

    Code:
    . l
    
         +-------------------------------------------------------+
         |  countyname           countynamegini   county   match |
         |-------------------------------------------------------|
      1. | Baldwin, AL   DeKalb County, Alabama   dekalb       1 |
      2. | Calhoun, AL      Lee County, Alabama      lee       1 |
         +-------------------------------------------------------+
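
    The code above compares only the first word of each countynamegini value, so a county whose name has more than one word would be reduced to its first word. A possible variant, offered as a sketch rather than a tested solution for the full data, builds a lowercased county key from each column and keeps the rows of countyname whose key appears among the countynamegini keys via -merge-. The names key, ginikey and ginikeys are made up for this illustration, and the sketch assumes every county entry looks like "Name, ST" or "Name County, State". It also ignores the state part, which is harmless in the sample because every row is in Alabama.

```stata
* Sample data from the thread
clear
input str29 countyname str42 countynamegini
"UNITED STATES" "Baldwin County, Alabama"
"ALABAMA"       "Calhoun County, Alabama"
"Autauga, AL"   "Cullman County, Alabama"
"Baldwin, AL"   "DeKalb County, Alabama"
"Barbour, AL"   "Elmore County, Alabama"
"Bibb, AL"      "Etowah County, Alabama"
"Blount, AL"    "Houston County, Alabama"
"Bullock, AL"   "Jefferson County, Alabama"
"Butler, AL"    "Lauderdale County, Alabama"
"Calhoun, AL"   "Lee County, Alabama"
end

* Key for countyname: text before the comma, lowercased ("Baldwin, AL" -> "baldwin").
* Rows without a comma (e.g. "UNITED STATES") get an empty key.
gen key = lower(trim(substr(countyname, 1, strpos(countyname, ",") - 1))) if strpos(countyname, ",") > 0

* Key for countynamegini: text before the comma, minus " County", lowercased.
gen ginikey = substr(countynamegini, 1, strpos(countynamegini, ",") - 1) if strpos(countynamegini, ",") > 0
replace ginikey = lower(trim(subinstr(ginikey, " County", "", 1)))

* Save the distinct, non-empty gini keys as a lookup table (hypothetical tempfile name)
preserve
keep ginikey
rename ginikey key
drop if key == ""
duplicates drop
tempfile ginikeys
save `ginikeys'
restore

* Keep only the rows whose county key appears in the lookup table
drop ginikey
merge m:1 key using `ginikeys', keep(match) nogenerate
sort countyname
list countyname key
```

    On the sample data this should leave the same two rows as above, Baldwin, AL and Calhoun, AL. If the real data cover several states, the key would presumably also need a state component, which requires mapping abbreviations such as "AL" to full state names.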
