Hi everyone,
I need to do a challenging task that is taking so much time. I need to match string variables from one dataset to another one. Problem: String are written in different ways across datasets
To do that, I have two different datasets:
Demand:
Thank you in advance!
Best,
Michael
I need to do a challenging task that is taking so much time. I need to match string variables from one dataset to another one. Problem: String are written in different ways across datasets
To do that, I have two different datasets:
- The first one is about the new vehicles that have been sold in Spain, from 2015 to 2019. Unfortunately, the name conventions of car models are not precise, so it's difficult to know which model households have bought (Demand).
- The second contains exhaustive information on the models available for purchase in Spain, over several years up to 2018. Here, the model names are extremely precise (Prices)
Demand:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str23 description str31 model str8 FEC_MATRICULA str1 COD_PROPULSION long cilindrada "ALFA ROMEO" "ALFA GIULIETTA" "12062017" "0" 1368 "ALFA ROMEO" "STELVIO" "29112019" "1" 2143 "ALFA ROMEO" "ALFA GIULIETTA" "19122018" "6" 1368 "ALFA ROMEO" "ALFA GIULIETTA" "15102015" "0" 1368 "ALFA ROMEO" "ALFA GIULIETTA" "19072017" "1" 1598 "ALFA ROMEO" "STELVIO" "02012019" "1" 2143 "ALFA ROMEO" "STELVIO" "08012019" "1" 2143 "ALFA ROMEO" "STELVIO" "11022019" "0" 1995 "ALFA ROMEO" "ALFA GIULIETTA" "17122015" "1" 1598 "ALFA ROMEO" "STELVIO" "15122017" "1" 2143 "ALFA ROMEO" "ALFA GIULIETTA" "16082019" "0" 1368 "ALFA ROMEO" "STELVIO" "22062018" "1" 2143 "ALFA ROMEO" "ALFA GIULIETTA" "16052019" "0" 1368 "ALFA ROMEO" "GIULIA" "21082019" "0" 1995 "ALFA ROMEO" "ALFA GIULIETTA" "30112018" "0" 1368 end
- description: Car Brand
- model: Car Model
- FEC_MATRICULA: Date of Registration
- COD_PROPULSION: 0 = Gasoline (Petrol); 1 = Diesel
- cilindrada: Cubic Capacity
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str13 description str77 model str9 periodocomercial int cc str3 gd long price "ALFA ROMEO" "GIULIETTA 1.4 TB Distinctive TCT 170" "2010-2016" 1368 "G" 20600 "ALFA ROMEO" "GIULIETTA 1.7 TBi QV" "2010-2016" 1742 "G" 22600 "ALFA ROMEO" "GIULIETTA 1.6 JTD m2 S.E.Veloce" "2010-2016" 1598 "D" 18300
Code:
"ALFA ROMEO" "Stelvio 2.0 PETROL 280 CV EXECUTIVE AWD" "2018-" 1995 "G" 44600 "ALFA ROMEO" "Stelvio 2.0 PETROL 280 CV SPECIALE AWD" "2017-" 1995 "G" 45000 "ALFA ROMEO" "Stelvio 2.0 Turbo 200 Executive Q4 Aut. 8V" "2018-" 1995 "G" 39500 "ALFA ROMEO" "Stelvio 2.0 Turbo 200 Super Q4 Aut." "2017-" 1995 "G" 36600 "ALFA ROMEO" "Stelvio 2.0 Turbo 280 Executive Q4 Aut. 8V" "2018-" 1995 "G" 43200 "ALFA ROMEO" "Stelvio 2.0 Turbo 280 First Edit.II Q4 Aut." "2017-" 1995 "G" 44600 "ALFA ROMEO" "Stelvio 2.2 D 150 RWD Aut." "2017-" 2143 "D" 33000 "ALFA ROMEO" "Stelvio 2.2 D 180 Q4 Aut." "2017-" 2143 "D" 36900 "ALFA ROMEO" "Stelvio 2.2 D 180 RWD Aut." "2017-" 2143 "D" 35100 "ALFA ROMEO" "Stelvio 2.2 D 180 Super Q4 Aut." "2017-" 2143 "D" 38600 "ALFA ROMEO" "Stelvio 2.2 D 180 Super RWD Aut." "2017-" 2143 "D" 36700 end
- description: Car Brand
- model: Car Model
- periodocomercial: Commercial Period
- cc: Cubic Capacity
- gd: G = Gasoline (Petrol); D = Diesel
- price: Car model prices
Thank you in advance!
Best,
Michael
Comment