Remove chunks of an indexed blob from an Azure AI Search index
I have an Azure AI Search service. It uses blob storage as the source for the files that are indexed and used for search.
The workflow is the following:
`product-data` is the storage for the original files. They go through an indexer, which indexes `document_id`, `filename` and `url`.
There is also a storage for file chunks, called `product-chunks`. For each document a folder is generated. Its name is a hash value, so there is no link back to the original file. The folder stores the file's content split into chunks (JSON files). This storage is also indexed, with the same fields plus `content`, `chunk_id` and `file_path`, so that the AI can find text in it.
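To make the goal concrete, here is a minimal sketch of the cleanup I have in mind: find every chunk that belongs to one original file and delete those chunks from the index. It rests on assumptions that are not confirmed above: that `chunk_id` is the key field of the chunk index, that `document_id` is filterable there and holds the same value as in the original-file index, and that `client` behaves like the Python SDK's `SearchClient` (only its `search` and `delete_documents` methods are used). The function name and the `batch_size` parameter are made up for illustration.

```python
from typing import Any


def delete_chunks_for_document(
    client: Any,
    document_id: str,
    key_field: str = "chunk_id",  # assumption: key field of the chunk index
    batch_size: int = 1000,       # keep each delete batch reasonably small
) -> int:
    """Delete every chunk whose document_id matches; return how many were sent for deletion."""
    # OData string literals escape a single quote by doubling it.
    escaped = document_id.replace("'", "''")

    # Assumption: document_id is marked filterable in the chunk index.
    results = client.search(
        search_text="*",
        filter=f"document_id eq '{escaped}'",
        select=[key_field],
    )

    # A delete action only needs the key of each document.
    keys = [{key_field: doc[key_field]} for doc in results]

    # Send the deletes in batches rather than in one large request.
    for start in range(0, len(keys), batch_size):
        client.delete_documents(documents=keys[start:start + batch_size])

    return len(keys)
```

With the real SDK, `client` would be something like `SearchClient(endpoint, index_name, AzureKeyCredential(key))` from `azure.search.documents` and `azure.core.credentials`, pointed at the chunk index. This only removes the documents from the index. The chunk blobs in `product-chunks` would still have to be deleted separately, otherwise the indexer could pick them up again on its next run.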