Facets Issue with Importing and Vectorizing Data in Azure Search
I’m encountering an issue while importing and vectorizing data in my project. After the import and vectorization run, my documents are split into a large number of chunks, which is fine since it keeps each piece under the maximum token input limit of the embedding model. The problem appears when I try to set up filtering by metadata_content_type. I marked the field as facetable, and as a result a single PPTX file produces a facet count of 19 for the value “pptx”. I assume this is because the engine treats each chunk as a separate document and doesn’t distinguish chunks that belong to the same source file. Is there anything I can do about this? I need embeddings on large documents, which means chunking is required, but I also need filtering with facets.
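For reference, this is roughly how I’m retrieving the facet counts (a minimal sketch using the azure-search-documents Python SDK; the endpoint, key, and index name are placeholders, and the index is the one produced by the import-and-vectorize flow):

```python
# Minimal sketch of the facet query; service details are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<my-service>.search.windows.net",
    index_name="<my-chunked-index>",  # index created by import + vectorize
    credential=AzureKeyCredential("<query-key>"),
)

# Ask only for facet buckets on metadata_content_type; top=0 skips the
# document results themselves.
results = client.search(search_text="*", facets=["metadata_content_type"], top=0)

for bucket in results.get_facets()["metadata_content_type"]:
    # For one PPTX split into 19 chunks this prints: pptx 19
    print(bucket["value"], bucket["count"])
```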