I am using the llama2 model to build a RAG-based LLM app with the Hugging Face sentence-transformers/all-MiniLM-L6-v2 embedding model and Pinecone as the vector database. I feed in a PDF and can get results via similarity search on a query against the vectors stored in Pinecone. My problem is passing the top-k results as context to RetrievalQA, specifically its retriever argument:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=??,
    return_source_documents=True,
    chain_type_kwargs=chain_type_kwargs,
)
Here is the complete code:
import os
from pinecone import Pinecone
from pinecone import ServerlessSpec
# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.environ.get('PINECONE_API_KEY')
# configure client
pc = Pinecone(api_key=api_key)
cloud = os.environ.get('PINECONE_CLOUD') or 'aws'
region = os.environ.get('PINECONE_REGION') or 'us-east-1'
spec = ServerlessSpec(cloud=cloud, region=region)
index_name = 'semantic-search-fast'
import time
existing_indexes = [
    index_info["name"] for index_info in pc.list_indexes()
]

# check if index already exists (it shouldn't if this is the first time)
if index_name not in existing_indexes:
    # if it does not exist, create the index
    pc.create_index(
        index_name,
        dimension=384,  # dimensionality of MiniLM
        metric='dotproduct',
        spec=spec
    )
    # wait for the index to be initialized
    while not pc.describe_index(index_name).status['ready']:
        time.sleep(1)
# connect to index
index = pc.Index(index_name)
time.sleep(1)
# populating vectors
for i, t in enumerate(text_chunks):
    query_result = embeddings.embed_query(t.page_content)
    index.upsert(
        vectors=[
            {
                "id": str(i),  # convert i to a string
                "values": query_result,
                "metadata": {"text": str(t.page_content)}  # metadata as dict
            }
        ],
        namespace="real"
    )
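(Side note on the loop above: upserting one vector per request gets slow for large PDFs, since Pinecone's upsert accepts a list of vectors per call. A minimal batching sketch — the helper name and batch size of 100 are my own choices, not from any library:)

```python
def batched(items, size=100):
    """Yield successive fixed-size slices of a list."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical usage against the code above (embeddings, text_chunks, index
# are assumed to exist as in the question):
#
# vectors = [
#     {"id": str(i),
#      "values": embeddings.embed_query(t.page_content),
#      "metadata": {"text": str(t.page_content)}}
#     for i, t in enumerate(text_chunks)
# ]
# for batch in batched(vectors, 100):
#     index.upsert(vectors=batch, namespace="real")
```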
question_embedding = embeddings.embed_query("What is acne?")
query_result = index.query(
    namespace="real",
    vector=question_embedding,
    top_k=3,
    include_metadata=True
)
relevant_docs = [hit["metadata"]["text"] for hit in query_result["matches"]]
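(For context: the "stuff" chain type essentially concatenates the retrieved texts into a single prompt. If I only needed the answer, I could hand-roll that step from relevant_docs — the prompt wording below is my own, not LangChain's:)

```python
def build_stuff_prompt(question: str, docs: list) -> str:
    """Concatenate retrieved chunks into a single 'stuffed' prompt."""
    context = "\n\n".join(docs)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# prompt = build_stuff_prompt("What is acne?", relevant_docs)
# answer = llm(prompt)  # or llm.invoke(prompt), depending on the llm wrapper
```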
I tried different approaches, like creating a vectorstore with
vectorstore = Pinecone.from_existing_index(index_name=index_name, embedding=HuggingFaceEmbeddings(), namespace='real')
and then passing retriever=vectorstore.as_retriever() to RetrievalQA.
But initializing the vectorstore throws an error:
__init__() missing 1 required positional argument: 'host'
and doesn't work. I want to know whether I can somehow get the code above to work as it is, without the vectorstore.