I have been using Azure AI Search and scoring profiles to boost the documents in my index that come from the ‘reviewed’ source. That is, I want documents that have the string ‘reviewed’ in the source field to appear at the very top, so I configured this scoring profile:
"scoringProfiles": [
  {
    "name": "pubs_peer_house",
    "functionAggregation": "sum",
    "text": {
      "weights": {
        "title": 3,
        "content": 10
      }
    },
    "functions": [
      {
        "fieldName": "source",
        "interpolation": "linear",
        "type": "tag",
        "boost": 100,
        "freshness": null,
        "magnitude": null,
        "distance": null,
        "tag": {
          "tagsParameter": "reviewed"
        }
      }
    ]
  }
],
I have this code in Python:
from azure.core.exceptions import HttpResponseError
from azure.search.documents.models import VectorizedQuery

# Define the search query
search_query = "cybersecurity in Arizona from tech today"

# Get the embedding of the search query (get_embedding is defined elsewhere)
search_vector = get_embedding(search_query)

# Define the scoring parameters
scoring_parameters = ["reviewed-100"]

try:
    # Perform the search using the Azure Cognitive Search client
    response = search_client.search(
        search_query,
        top=5,  # Return the top 5 results
        vector_queries=[
            VectorizedQuery(
                vector=search_vector,  # The vector representation of the search query
                k_nearest_neighbors=5,  # Number of nearest neighbors to find
                fields="vector_space_search",  # Field in the index to search the vector space
            )
        ],
        query_type="semantic",  # Specify the query type as semantic
        semantic_configuration_name=semantic_profile,  # The name of the semantic configuration
        scoring_profile="pubs_peer_house",  # The scoring profile to use
        scoring_parameters=scoring_parameters,  # The scoring parameters
    )

    # Iterate over the search results
    for idx, doc in enumerate(response, start=1):
        # Default values if fields are not found
        found_content = "Not found"

        # Extract fields from the search result
        date = doc.get("date", "N/A")  # Date of the document
        source = doc.get("source", "N/A")  # Source of the document
        title = doc.get("title", "N/A")  # Title of the document
        content = doc.get("content", "N/A")  # Content of the document

        # Print the results
        print(f"{idx}")
        print(f"Score: {doc['@search.score']:.5f}")
        print(f"Source: {source}")
        print(f"Title: {title}")
        print(f"Content: {content}\n\n")
except HttpResponseError as e:
    # Handle HTTP response errors
    print(f"HTTP Response Error: {e.message}")
    print(f"Details: {e.response}")
except Exception as ex:
    # Handle other exceptions
    print(f"An error occurred: {ex}")
Nonetheless, when I run any query with my scoring profile together with the semantic ranker in a hybrid search, the boost value makes no difference: I always get the same results. Look:
1
Score: 0.03667
Source: americas
Subtitle: What level of cybersecurity do you have?
Content: We comply with industry standards for cybersecurity and recommend that you…

2
Score: 0.02639
Source: reviewed
Subtitle: What do I need to operate securely?
Content: The key or security signature. It is an 8-digit alphanumeric code ML te….

3
Score: 0.01562
Source: europe
Subtitle: Passkey password and pin still better than faceID
Content: careful whose face do you trust….
Even with parameters like scoring_parameters = ["reviewed-2500000"], I still get:
1
Score: 0.03667
Source: americas
Subtitle: What level of cybersecurity do you have?
Content: We comply with industry standards for cybersecurity and recommend that you…

2
Score: 0.02639
Source: reviewed
Subtitle: What do I need to operate securely?
Content: The key or security signature. It is an 8-digit alphanumeric code ML te….

3
Score: 0.01562
Source: europe
Subtitle: Passkey password and pin still better than faceID
Content: careful whose face do you trust….
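
Since the semantic ranker reorders results after the initial scoring, it may help to print the ranker's score next to the regular score when comparing runs. The sketch below is only an illustration: `summarize_hit` is a hypothetical helper, and it assumes each result behaves like a dictionary and carries an `@search.reranker_score` key when semantic ranking is applied. The sample dictionaries are hand-written stand-ins, and the reranker value in them is a made-up placeholder, not real output.

```python
def summarize_hit(doc):
    """Return a one-line summary showing both scores for a dict-like search hit."""
    score = doc.get("@search.score")
    # Assumption: this key is only present when semantic ranking was applied.
    reranker = doc.get("@search.reranker_score")
    source = doc.get("source", "N/A")
    score_txt = f"{score:.5f}" if score is not None else "n/a"
    reranker_txt = f"{reranker:.5f}" if reranker is not None else "n/a"
    return f"source={source} score={score_txt} reranker={reranker_txt}"


# Hand-written dictionaries standing in for search results (placeholder values).
sample_hits = [
    {"@search.score": 0.03667, "@search.reranker_score": 2.5, "source": "americas"},
    {"@search.score": 0.02639, "source": "reviewed"},
]

for hit in sample_hits:
    print(summarize_hit(hit))
```

In the loop over `response` from the search call, `print(summarize_hit(doc))` could replace the individual score and source prints, so that two runs with different boost values can be compared line by line.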
Am I doing something wrong? I can't seem to find a Python tutorial on this online. Thank you so much for all of your help.