  • Merging two datasets with observations having different spellings

    Hello,

    I'm merging two datasets using two variables: state and district, but the spellings differ in the two datasets. How should I go about it?

    Example datasets:

    Dataset 1
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str16 statename str30 distname
    "Orissa" "Balasore"           
    "Orissa" "Bolangir"           
    "Orissa" "Cuttack"            
    "Orissa" "Dhenkanal"          
    "Orissa" "Ganjam"             
    "Orissa" "Kalahandi"          
    "Orissa" "Keonjhar"           
    "Orissa" "Koraput"            
    "Orissa" "Mayurbhanja"        
    "Orissa" "Phulbani(Kandhamal)"
    "Orissa" "Puri"               
    "Orissa" "Sambalpur"          
    "Orissa" "Sundargarh"         
    "Orissa" "Bhadrak"            
    "Orissa" "Jagatsinghapur"     
    "Orissa" "Jajapur"            
    "Orissa" "Kendrapara"         
    "Orissa" "Angul"              
    "Orissa" "Gajapati"           
    "Orissa" "Nuapada"            
    "Orissa" "Malkangiri"         
    "Orissa" "Nawarangpur"        
    "Orissa" "Rayagada"           
    "Orissa" "Khurda"             
    "Orissa" "Nayagarh"           
    "Orissa" "Boudh"              
    "Orissa" "Bargarh"            
    "Orissa" "Deogarh"            
    "Orissa" "Jharsuguda"         
    "Orissa" "Sonepur"            
    end
    Dataset 2

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str25 state str27 dist
    "Odisha" "Anugul"        
    "Odisha" "Balangir"      
    "Odisha" "Baleshwar"     
    "Odisha" "Bargarh"       
    "Odisha" "Baudh"         
    "Odisha" "Bhadrak"       
    "Odisha" "Cuttack"       
    "Odisha" "Debagarh"      
    "Odisha" "Dhenkanal"     
    "Odisha" "Gajapati"      
    "Odisha" "Ganjam"        
    "Odisha" "Jagatsinghapur"
    "Odisha" "Jajapur"       
    "Odisha" "Jharsuguda"    
    "Odisha" "Kalahandi"     
    "Odisha" "Kandhamal"     
    "Odisha" "Kendrapara"    
    "Odisha" "Kendujhar"     
    "Odisha" "Khordha"       
    "Odisha" "Koraput"       
    "Odisha" "Malkangiri"    
    "Odisha" "Mayurbhanj"    
    "Odisha" "Nabarangapur"  
    "Odisha" "Nayagarh"      
    "Odisha" "Nuapada"       
    "Odisha" "Puri"          
    "Odisha" "Rayagada"      
    "Odisha" "Sambalpur"     
    "Odisha" "Subarnapur"    
    "Odisha" "Sundargarh"    
    end
    Thanks

  • #2
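    There are some "fuzzy" merge commands (for example, the community-contributed reclink and matchit from SSC), but I recommend substituting the names from one dataset into the other wherever they differ. Do the substitutions within your do-file so you have a record of every change.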
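
    A minimal sketch of the substitution approach, using the example data from post #1. The file names data1.dta and data2.dta are hypothetical (they assume the two dataex listings were saved under those names), and the pairing of old and new district spellings below is an assumption that should be checked against an official district list before use.

    ```stata
    * Hypothetical file names: data1.dta holds statename/distname,
    * data2.dta holds state/dist (the two dataex examples, saved to disk).
    use data2, clear

    * Give the key variables the same names as in dataset 1
    rename (state dist) (statename distname)

    * Harmonize the state spelling
    replace statename = "Orissa" if statename == "Odisha"

    * Harmonize district spellings (assumed correspondences; verify each one)
    replace distname = "Angul"               if distname == "Anugul"
    replace distname = "Bolangir"            if distname == "Balangir"
    replace distname = "Balasore"            if distname == "Baleshwar"
    replace distname = "Boudh"               if distname == "Baudh"
    replace distname = "Deogarh"             if distname == "Debagarh"
    replace distname = "Phulbani(Kandhamal)" if distname == "Kandhamal"
    replace distname = "Keonjhar"            if distname == "Kendujhar"
    replace distname = "Khurda"              if distname == "Khordha"
    replace distname = "Mayurbhanja"         if distname == "Mayurbhanj"
    replace distname = "Nawarangpur"         if distname == "Nabarangapur"
    replace distname = "Sonepur"             if distname == "Subarnapur"

    * Merge on the harmonized keys and inspect what failed to match
    merge 1:1 statename distname using data1
    tabulate _merge
    list statename distname if _merge != 3
    ```

    Any observations listed by the last command still have spellings that differ between the two files and need a further replace line. Once the list is empty, adding assert _merge == 3 after the merge will stop the do-file if a later version of either dataset introduces a new mismatch.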


    • #3
      Thanks, George.
