I’ve built a RAG pipeline using AI-Search and OpenAI that generally works fine; however, inline sources are not returned reliably. The response object always includes citations, but these are simply the top X results returned by the AI-Search vector search. At that point you do not know which of these 5 or 10 sources were actually used to answer the question.
Usually the actual answer includes inline citations such as [doc2]. I use a regex to find these tags, extract the number, and return the corresponding document (here, the second one) from my citations object. This way I can list only the sources that were actually used.
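For reference, the extraction step looks roughly like this (the citations list is the set of documents I get back alongside the response; the names here are only illustrative):

import re

DOC_TAG = re.compile(r"\[doc(\d+)\]")

def extract_used_citations(answer: str, citations: list) -> list:
    """Map each [docN] tag in the answer to the N-th retrieved document (1-based)."""
    used = []
    for match in DOC_TAG.finditer(answer):
        idx = int(match.group(1)) - 1   # [doc2] -> citations[1]
        if 0 <= idx < len(citations) and citations[idx] not in used:
            used.append(citations[idx])
    return used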
The problem is that this is incredibly unreliable for me: at least 7 out of 10 times there is no [docX] tag anywhere in the response, and therefore no sources are listed.
For now I just list all returned documents as “potential sources”, but this is less than ideal.
I currently use this package version:
openai == 1.14.2
And this is my code:
def ask_llm_citation(USER_INPUT: str, history: list, config: dict):
    """Stream an answer from Azure OpenAI grounded on the AI-Search index."""

    def parse_multi_columns(columns: str) -> list:
        # Field names can be separated by "|" or ","
        if "|" in columns:
            return columns.split("|")
        return columns.split(",")

    # Add the conversation history to the prompt
    messages = []
    for question, answer in history[-AZURE_OPENAI_CONVERSATION_HISTORY:]:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})

    # Add the latest user input to the prompt
    messages.append({"role": "user", "content": USER_INPUT})

    client = openai.AzureOpenAI(
        base_url=f"{OPENAI_API_BASE}/openai/deployments/{OPENAI_DEPLOYMENT_NAME}/extensions",
        api_key=OPENAI_API_KEY,
        api_version=OPENAI_API_VERSION_CITATION,
    )

    response = client.chat.completions.create(
        messages=messages,
        model=OPENAI_DEPLOYMENT_NAME,
        temperature=OPENAI_API_GPT_TEMPERATURE,
        seed=12345,
        extra_body={
            "dataSources": [
                {
                    "type": "AzureCognitiveSearch",
                    "parameters": {
                        "endpoint": AZURE_COGNITIVE_SEARCH_ENDPOINT,
                        "key": AZURE_COGNITIVE_SEARCH_KEY,
                        "indexName": AZURE_COGNITIVE_SEARCH_INDEX_NAME,
                        "fieldsMapping": {
                            "contentFields": parse_multi_columns("content"),
                            "urlField": "url_name",
                            "filepathField": "file_name",
                            "vectorFields": parse_multi_columns("content_vector"),
                        },
                        "embeddingDeploymentName": OPENAI_API_DEPLOYMENT_NAME_EMBEDDING,
                        "queryType": AZURE_COGNITIVE_SEARCH_QUERY_TYPE,
                        "inScope": True,
                        "roleInformation": AZURE_OPENAI_SYSTEM_MESSAGE,
                        "topNDocuments": AZURE_COGNITIVE_SEARCH_NR_DOCUMENTS,
                        "strictness": AZURE_COGNITIVE_SEARCH_STRICTNESS,
                    },
                }
            ]
        },
        stream=True,
    )

    # Stream the deltas back to the caller, skipping chunks without choices
    for chunk in response:
        try:
            yield chunk.choices[0].delta
        except (IndexError, AttributeError):
            pass
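For completeness, this is roughly how I consume the generator and then apply the regex filter from above. The extract_used_citations helper and the citations list are the illustrative pieces from earlier; how and where the citations arrive in the stream depends on your API version, so treat this as a sketch:

# Illustrative only: accumulate the streamed answer, then filter the citations.
# `citations` is assumed to be the list of retrieved documents already collected
# from the response (however your API version delivers it).
answer_parts = []
for delta in ask_llm_citation(USER_INPUT, history, config):
    content = getattr(delta, "content", None)
    if content:
        answer_parts.append(content)

full_answer = "".join(answer_parts)
used_sources = extract_used_citations(full_answer, citations)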
Does anyone know of a fix for this, or of another method to pinpoint the sources that were actually used to answer the question, for example by comparing the LLM answer with the 5 or 10 documents returned by AI-Search?
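To make concrete what I mean by “comparing”, the kind of fallback I have in mind would be something like the sketch below: embed the final answer and each retrieved chunk, then keep only the closest ones. The function names, the "content" field, the embedding deployment and the threshold are all placeholders, not anything from my current setup, and I have no idea whether this is reliable in practice:

import numpy as np

def rank_sources_by_similarity(client, answer: str, citations: list,
                               embedding_deployment: str, threshold: float = 0.75):
    """Fallback: score each retrieved chunk against the answer via cosine similarity."""
    # Assumes each citation dict exposes its text under a "content" key.
    texts = [answer] + [doc["content"] for doc in citations]
    result = client.embeddings.create(model=embedding_deployment, input=texts)
    vectors = [np.array(item.embedding) for item in result.data]
    answer_vec, doc_vecs = vectors[0], vectors[1:]

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = [(cosine(answer_vec, v), doc) for v, doc in zip(doc_vecs, citations)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score >= threshold]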