Related topic: C++ Faiss – How to search in subsets
I want to do something close to this post.
The idea is to exclude a different set of fitted (indexed) vectors from the search for each query vector.
I would imagine something like this:
filter_ids = [[1, 2], [0, 1], [2, 3], [2, 3]] # this is not supported by faiss
id_selector = faiss.IDSelectorNot(faiss.IDSelectorArray(filter_ids))
filtered_distances, filtered_indices = index.search(query_vector, k, params=faiss.SearchParametersIVF(sel=id_selector))
where:
- fit_vector 1 and 2 are excluded from the neighbors’ search of the query_vector 0
- fit_vector 0 and 1 are excluded from the neighbors’ search of the query_vector 1
- fit_vector 2 and 3 are excluded from the neighbors’ search of the query_vector 2
- fit_vector 2 and 3 are excluded from the neighbors’ search of the query_vector 3
If this is impossible, I would have to run a separate search for each query vector, which could be very slow for large datasets. Another option is to search without any filter and remove the filter_ids from the results afterwards, but that is slow for large datasets too.
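For reference, the post-filtering option does not have to loop in Python. Below is a minimal NumPy-only sketch, assuming the search was run with enough extra neighbors (k plus the longest exclusion list) so that k valid results remain per query. The function name drop_excluded and the choice of -1 / np.inf as placeholders for missing results are my own assumptions, not part of the Faiss API.

```python
import numpy as np

def drop_excluded(distances, indices, filter_ids, k):
    """Keep the k best neighbors per query after removing per-query excluded ids.

    distances, indices: (n_queries, k_over) arrays as returned by index.search,
    ordered best-first in each row.
    filter_ids: one list of excluded ids per query (lists may differ in length).
    """
    n_queries = indices.shape[0]
    # Pad the ragged exclusion lists to a rectangle with -1.
    # -1 is never a real id; Faiss also uses -1 for "no result",
    # so those empty slots are dropped as well.
    width = max((len(ids) for ids in filter_ids), default=0)
    excluded = np.full((n_queries, max(width, 1)), -1, dtype=np.int64)
    for row, ids in enumerate(filter_ids):
        excluded[row, :len(ids)] = ids

    # banned[q, j] is True when indices[q, j] appears in excluded[q].
    banned = (indices[:, :, None] == excluded[:, None, :]).any(axis=2)

    # A stable sort on the boolean mask moves kept columns to the front
    # while preserving their original best-first order.
    order = np.argsort(banned, axis=1, kind="stable")[:, :k]
    out_idx = np.take_along_axis(indices, order, axis=1)
    out_dist = np.take_along_axis(distances, order, axis=1)

    # If fewer than k neighbors survived, blank out the leftover slots.
    leftover = np.take_along_axis(banned, order, axis=1)
    out_idx[leftover] = -1
    out_dist[leftover] = np.inf
    return out_dist, out_idx
```

It would be called on the output of something like index.search(query_vector, k + 2) for the example above, since every query excludes two ids. The intermediate mask has shape (n_queries, k_over, longest exclusion list), so memory grows with all three.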
Do you recommend something else?