Let’s say I have two tables:
city | street_1 |
---|---|
Toronto | 123 Banana St |
New York | 69 Good Time Rd |
Lost Angeles | 420 Blaze It Dr |
Dallas | 888 Wonderful Line |
And
street_2 | fruit |
---|---|
123 Bannana St | Grapes |
69 Good Tme Rd | Apple |
420 Blaeze It Dr | Banana |
888 Wonderaful Line | Pear |
I want to join the above two tables on street_1 and street_2. How would I go about that since the values in both street_1 and street_2 are different? Levenshtein is out of question since it is going to take way too long.
Thanks mateys.
5
If you need to make a JOIN
on such data structure, your tables are wrong. You’ll depend on user input correctness to make it work the way you expect, and we know user input is not reliable at all.
The proper fix is to normalize your database. Introduce a third table that will contain addresses, each with its own id. Your original tables won’t contain the streets themselves, but only the id, which is internal to the system and not subject to user input. That way your street names are contained in a single place and you don’t rely on both places being equal.