I have a table that consists of vehicle plates, where all the plates have always 7 characters.
Sometimes a plate was miswritten and so my SELECT query must be able to consider that “AAU1234″ and “AAV1234″ must be the same vehicle.
I don’t want to create rules between specific characters, like “U” and “V” or “I” and “1”, but a rule able to group plates where 6 of the 7 characters are equal and in the same position.
eg.:
In this case, ids 1, 2 and 5 should appear only once.
It would be great if, when grouped, all the grouped plates were informed, concatened in another column.
More information: This is a big table and currently identical plates are grouped when inside the same date-time range of 15 minutes.
My query looks like this:
select plate, floor(unix_timestamp(date)/(15 * 60)) as timekey from table group by plate, timekey order by date desc
Following the image example above, my goal is to group ids 1, 2 and 5 in the same row, since id 1 and id 2 matches 6 of 7 strings and so id 1 and id 5.
The result could be something like:
Or:
The final date information is not important here, the most important thing here is to group the similar plates.