This is a follow-up question to: Error in Azure Cognitive Search Service when storing document page associated to each chunk extracted from PDF in a custom WebApiSkill
How do I store the vectors generated by AzureOpenAIEmbeddingSkill in the index, given my current setup?
- Custom WebApiSkill:
combined_list = [{'textItems': text, 'numberItems': number} for text, number in zip(chunks, page_numbers)]
# response object for specific pdf
response_record = {
    "recordId": recordId,
    "data": {
        "subdata": combined_list
    }
}
response_body['values'].append(response_record)
- Skillset definition:
{
  ...
  "description": "Skillset to chunk documents and generating embeddings",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
      "name": "splitclean",
      "description": "Custom split skill to chunk documents with specific chunk size and overlap",
      "context": "/document",
      "httpMethod": "POST",
      "timeout": "PT30S",
      "batchSize": 1,
      "degreeOfParallelism": null,
      "authResourceId": null,
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "subdata",
          "targetName": "subdata"
        }
      ],
      "authIdentity": null
    },
    {
      "name": "#2",
      "description": "Skill to generate embeddings via Azure OpenAI",
      "context": "/document/subdata/*",
      "apiKey": "<redacted>",
      "deploymentId": "embedding-ada-002",
      "dimensions": null,
      "modelName": "experimental",
      "inputs": [
        {
          "name": "text",
          "source": "/document/subdata/*/textItems"
        }
      ],
      "outputs": [
        {
          "name": "embedding",
          "targetName": "vector"
        }
      ],
      "authIdentity": null
    }
  ],
  "cognitiveServices": null,
  "knowledgeStore": null,
  "indexProjections": {
    "selectors": [
      {
        "parentKeyFieldName": "parent_id",
        "sourceContext": "/document/subdata/*",
        "mappings": [
          {
            "name": "chunk",
            "source": "/document/subdata/*/textItems",
            "sourceContext": null,
            "inputs": []
          },
          {
            "name": "vector",
            "source": "/document/subdata/*/vector",
            "sourceContext": null,
            "inputs": []
          },
          {
            "name": "title",
            "source": "/document/metadata_storage_name",
            "sourceContext": null,
            "inputs": []
          },
          {
            "name": "page_number",
            "source": "/document/subdata/*/numberItems",
            "sourceContext": null,
            "inputs": []
          }
        ]
      }
    ],
    "parameters": {
      "projectionMode": "skipIndexingParentDocuments"
    }
  },
  "encryptionKey": null
}
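To make the enrichment paths in the skillset concrete, here is a minimal, self-contained sketch of the response body the custom skill returns for one record. The function name `build_skill_response` and the sample chunks and page numbers are hypothetical and only for illustration; the `textItems`, `numberItems`, and `subdata` keys are the ones from my skill.

```python
import json


def build_skill_response(record_id, chunks, page_numbers):
    """Build a custom-skill response body holding a single record.

    Hypothetical helper: it wraps the same list comprehension used in the
    skill so the resulting shape can be inspected on its own.
    """
    combined_list = [
        {"textItems": text, "numberItems": number}
        for text, number in zip(chunks, page_numbers)
    ]
    return {
        "values": [
            {"recordId": record_id, "data": {"subdata": combined_list}}
        ]
    }


# Hypothetical sample values, for illustration only.
body = build_skill_response("0", ["first chunk", "second chunk"], [1, 2])

# Each element of "subdata" is one item that the context
# /document/subdata/* iterates over; textItems and numberItems
# are the per-item fields referenced in the skillset.
print(json.dumps(body, indent=2))
```

With this shape, `/document/subdata/*/textItems` and `/document/subdata/*/numberItems` each refer to one field per chunk, and the embedding skill's `vector` output is expected to sit next to them under the same item.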
I get the following error from the AzureOpenAIEmbeddingSkill:
Web Api response status: 'Unauthorized', Web Api response details: '{"error":{"code":"401","message":"Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource."}}'
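Since the message points at either the key or the endpoint, one way to narrow it down might be to call the embeddings deployment directly with the same values, outside of the indexer. The following is only a sketch under assumptions: the helper names `build_embeddings_request` and `check_key` are hypothetical, the resource URI is a placeholder that has to be replaced with the real Azure OpenAI endpoint, and the `api-version` value `2023-05-15` is an assumption that may need adjusting.

```python
import json
import urllib.error
import urllib.request


def build_embeddings_request(resource_uri, deployment_id, api_key,
                             api_version="2023-05-15"):
    """Build a POST request for an Azure OpenAI embeddings deployment.

    Hypothetical helper; api_version is an assumed default.
    """
    url = (
        f"{resource_uri.rstrip('/')}/openai/deployments/"
        f"{deployment_id}/embeddings?api-version={api_version}"
    )
    payload = json.dumps({"input": "test"}).encode("utf-8")
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=payload, headers=headers,
                                  method="POST")


def check_key(request):
    """Send the request and return the HTTP status code (e.g. 200 or 401)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Usage (not executed here; the URI and key are placeholders):
# req = build_embeddings_request(
#     "https://<your-resource>.openai.azure.com", "embedding-ada-002", "<key>")
# print(check_key(req))
```

If this direct call also returns 401, the key and endpoint pair itself would be the problem; if it succeeds, the values the skill is actually using would be the next thing to compare.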