I have a list of c. 11k strings that I need to partially match against string IDs in a BigQuery table.
T1 is the original BigQuery table and T2 is a table with a single column containing the 11,000 strings.
At the minute I have the following query:
```sql
SELECT T1.ID, T2.ID
FROM T1
JOIN T2
  ON T1.ID LIKE '%' || T2.ID || '%'
```
I have tested this on a subset of the data and it works as expected, producing:
| T1 | T2 |
|---|---|
| 15688463 | 56884 |
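For clarity, the join condition above is just a substring check: a T1/T2 pair matches whenever the T2 string appears anywhere inside the T1 ID. A minimal Python sketch of the same logic (the function name `substring_join` is hypothetical, purely for illustration) shows why the cost grows with the product of the two table sizes:

```python
def substring_join(t1_ids, t2_ids):
    """Return every (t1, t2) pair where t2 is a substring of t1.

    This mirrors the SQL join `ON T1.ID LIKE '%' || T2.ID || '%'`:
    a nested loop doing len(t1_ids) * len(t2_ids) comparisons,
    which is why the query gets expensive at full scale.
    """
    return [(a, b) for a in t1_ids for b in t2_ids if b in a]

pairs = substring_join(["15688463"], ["56884", "99999"])
# Only "56884" is contained in "15688463", so one pair is returned.
```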
But I am aware that when I scale this to the full population it will be a very expensive query.
Is there a way that I can optimise this query, or indeed another method for getting the result (other than splitting the 11,000 strings into subsets)?
Thanks in advance