Maybe my question is similar to this: Azure Open AI failing to read Search Index?
I have developed a bot in a Jupyter Notebook that conducts a questionnaire using the configuration and endpoints of Azure OpenAI. It worked well. Now I want the bot to use my own data (4 JSON arrays stored in .txt format). This also worked well in the Playground. But now I want to create an index in the Azure portal (or, if possible, from Jupyter, I would appreciate your input; I know it can also be done via the REST API in Visual Studio Code).
For this, I want to map my JSON fields to search fields. I have read Quickstart: Create a search index in the Azure portal, but I may be doing something wrong, because it does not work. I also believe the proper approach would be to merge the 4 JSON arrays, with their respective subfields, into single JSON documents, like the single-hotel JSON files in the quickstart example.
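For the Jupyter part, my understanding is that an index can also be created from Python by calling the Azure AI Search REST API directly. This is only a sketch, not something I have confirmed works for my data: the index name, the `id` key field, and the api-version are my assumptions; the other field names come from my 1st file:

```python
import json
import os
import urllib.request


def build_index_definition(name: str) -> dict:
    """Index schema mapping the JSON fields of my 1st file to search fields."""
    return {
        "name": name,
        "fields": [
            # Every search index needs exactly one key field of type Edm.String.
            {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
            {"name": "indicator", "type": "Edm.String", "searchable": True},
            {"name": "Definition", "type": "Edm.String", "searchable": True},
        ],
    }


def create_index(search_endpoint: str, api_key: str, definition: dict) -> int:
    """PUT the index definition to the Search service; returns the HTTP status."""
    url = f"{search_endpoint}/indexes/{definition['name']}?api-version=2023-11-01"
    req = urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    definition = build_index_definition("indicators-index")
    # Only call the service when the environment is actually configured.
    if os.environ.get("AZURE_AI_SEARCH_ENDPOINT"):
        create_index(
            os.environ["AZURE_AI_SEARCH_ENDPOINT"],
            os.environ["AZURE_AI_SEARCH_API_KEY"],
            definition,
        )
```

The same could presumably be done with the `azure-search-documents` SDK instead of raw REST calls.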
Also, I believe that my JSON arrays are incorrect or invalid. What Python code (in a Jupyter Notebook) would help me detect and fix this? See this example:
My JSON data files are parsed like this:
- The 1st file has two fields (with 46 indicators) and looks like this:
[
{
"indicator":"xxxx",
"Definition":"xxx"
},
{
"indicator":"xxxx",
"Definition":"xxx"
},
{
"indicator":"xxx",
"Definition":"xxx"
},
{
...
}
]
- The 2nd file (with ca. 40 questions):
[
{
"question":1,
"explanation":"xxx"
},
{
"question":2,
"explanation":"xxx"
},
{
"question":n,
"explanation":"xxx"
}
]
- The 3rd file (with ca. 20 indicators):
[
{
"Category":"xxx",
"Mitigation measure":"xxx"
},
{
"Category":"xxx",
"Mitigation measure":"xxx"
}
]
- And the 4th and most important file contains the following subfields (with ca. 200 site IDs). I have them both in one JSON array and also in separate JSON arrays, for a total of 200 files:
{
"Site ID": 1,
"Site Name": "xxxx",
"BU RC Name": "xxx",
"Country": "xxx",
"Street": "xxx",
"postal code City": "xxx",
"Total Overall water risk": "Low - Medium (1-2)",
"Physical risk quantity": "High (3-4)",
"Physical Risk Quantity": "High (3-4)",
"Regulatory and reputational risk": "Low (0-1)",
"Baseline Water Stress": "High (40-80%)",
"Baseline Water Depletion": "Low - Medium (5-25%)",
"Interannual variability": "Low - Medium (0.25-0.50)",
"Seasonal Variability": "Low (<0.33)",
"Groundwater table decline": "Insignificant Trend",
"Riverine flood risk": "Low (0 to 1 in 1000)",
"Coastal flood risk": "Low (0 to 9 in 1000000)",
"Drought risk": "Medium (0.4-0.6)",
"Untreated Connected Wastewater": "Low (<30%)",
"Coastal eutrophication potential": "High (1 to 5)",
"Unimproved/No Drinking Water": "Low (<2.5%)",
"Unimproved/no sanitation": "Low (<2.5%)",
"Peak RepRisk country ESG risk index": "Low (<25%)",
"Water Supply Optimistic 2030": "Near normal",
"Water Demand Optimistic 2030": "1.2x decrease",
"Seasonal Variability Optimistic 2030": "Near normal",
"Water Stress Optimistic 2040": "1.4x decrease",
"Water Supply Optimistic 2040": "Near normal",
"Water Demand Optimistic 2040": "1.4x decrease",
"Seasonal Variability Optimistic 2040": "1.1x increase",
"Water Stress Business as Usual 2030": "1.4x decrease",
"Water Supply Business as Usual 2030": "Near normal",
"Water Demand Business as Usual 2030": "1.2x decrease",
"Seasonal Variability Business as Usual 2030": "Near normal",
"Water Stress Business as Usual 2040": "1.4x decrease",
"Water Supply Business as Usual 2040": "Near normal",
"Water Demand Business as Usual 2040": "1.4x decrease",
"Seasonal Variability Business as Usual 2040": "Near normal",
"Water Stress Pessimistic 2030": "1.4x decrease",
"Water Supply Pessimistic 2030": "Near normal",
"Water Demand Pessimistic 2030": "1.2x decrease",
"Seasonal Variability Pessimistic 2030": "Near normal",
"Water Stress Pessimistic 2040": "1.4x decrease",
"Water Supply Pessimistic 2040": "Near normal",
"Water Demand Pessimistic 2040": "1.4x decrease",
"Seasonal Variability Pessimistic 2040": "Near normal"
}
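To check which of the four files actually parse as valid JSON (and where exactly they break), I assume a small helper like this would do; the filenames are placeholders for my actual .txt files:

```python
import json
from pathlib import Path


def check_json(path: str):
    """Parse a file as JSON; report the exact error location if it is invalid."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        data = json.loads(text)
    except json.JSONDecodeError as e:
        print(f"{path}: INVALID at line {e.lineno}, column {e.colno}: {e.msg}")
        return None
    count = len(data) if isinstance(data, list) else 1
    print(f"{path}: valid ({count} record(s))")
    return data


# Placeholder filenames for my four files.
for name in ["indicators.txt", "questions.txt", "mitigation.txt", "sites.txt"]:
    if Path(name).exists():
        check_json(name)
```

For example, a file that is missing its opening `[` would be reported as invalid ("Extra data") at the comma after the first object, which pinpoints where to fix it.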
I want to be able to query the index (connected to my Azure OpenAI deployment) from Python (Jupyter Notebook) using this code:
import os

import dotenv
import openai

dotenv.load_dotenv()

endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
api_key = os.environ.get("AZURE_OPENAI_API_KEY")
deployment = os.environ.get("AZURE_OPEN_AI_DEPLOYMENT_ID")

client = openai.AzureOpenAI(
    base_url=f"{endpoint}/openai/deployments/{deployment}/extensions",
    api_key=api_key,
    api_version="2023-08-01-preview",
)

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How is Azure machine learning different than Azure OpenAI?",
        },
    ],
    extra_body={
        "dataSources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": os.environ["AZURE_AI_SEARCH_ENDPOINT"],
                    "key": os.environ["AZURE_AI_SEARCH_API_KEY"],
                    "indexName": os.environ["AZURE_AI_SEARCH_INDEX"],
                },
            }
        ]
    },
)

print(completion.model_dump_json(indent=2))
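For completeness: before the chat code above can ground answers on the sites, I assume the 4th file's records have to be uploaded into the index first. This is a rough REST sketch; the `id` key (which must be a string) and the camel-case target field names (`siteName`, `country`) are my hypothetical mapping of the file's subfields:

```python
import json
import urllib.request


def to_search_documents(sites: list[dict]) -> dict:
    """Wrap site records in the batch format the Search REST API expects."""
    value = []
    for site in sites:
        doc = {
            "@search.action": "upload",
            # The index key must be a string, so convert "Site ID".
            "id": str(site["Site ID"]),
            # Hypothetical mapping of my 4th file's subfields to search fields.
            "siteName": site.get("Site Name", ""),
            "country": site.get("Country", ""),
        }
        value.append(doc)
    return {"value": value}


def upload_documents(search_endpoint: str, api_key: str, index: str, batch: dict) -> int:
    """POST the batch to the index; returns the HTTP status code."""
    url = f"{search_endpoint}/indexes/{index}/docs/index?api-version=2023-11-01"
    req = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Is this roughly the right approach, or is there a better way to get the 4 files into one index?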