I’m developing a Streamlit application that processes a large PDF (273 pages) and integrates a chatbot that uses GPT-4 together with `text-embedding-ada-002` embeddings. Initially, I used `PyPDFium2Loader` successfully, but I ran into issues after switching to `AzureAIDocumentIntelligenceLoader` with `mode="single"`.
Setup:
- Extract text from the PDF with `PyPDFium2Loader` and split it using `RecursiveCharacterTextSplitter` (chunk_size=10000, chunk_overlap=1000).
- Initialize the chatbot with `OpenAIEmbeddings(model='text-embedding-ada-002')` and interact with it via a Streamlit interface (a simplified sketch of the chat loop follows this list).
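For context, the Streamlit side is essentially a minimal chat loop like the sketch below. `answer_question` is a placeholder name for my actual retrieval + GPT-4 call, so treat this as illustrative rather than my exact code:

```python
import streamlit as st

st.title("PDF Chatbot")

def answer_question(question: str) -> str:
    # Placeholder: in the real app this runs a similarity search against the
    # Pinecone index and sends the retrieved context plus the question to GPT-4.
    return "..."

user_input = st.text_input("Ask a question about the document")
if user_input:
    st.write(answer_question(user_input))
```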
Issue:
- With `PyPDFium2Loader`, the vector count is 276 and the chatbot works correctly.
- After switching to `AzureAIDocumentIntelligenceLoader` (mode="single"), the vector count drops to 114, and I consistently get a 429 error (Too Many Requests) even when testing with a single message like “hi”.
Additional Context:
- The Azure Document Intelligence API has a rate limit of 10 transactions per second (TPS).
- I’ve implemented retry logic with exponential backoff and rate limiting in Python (a simplified sketch is shown after this list).
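The retry wrapper is roughly the following. This is a simplified sketch: the function name `with_exponential_backoff` and the string-based 429 check are illustrative, and in practice you would catch the SDK-specific rate-limit exception instead of a bare `Exception`:

```python
import random
import time

def with_exponential_backoff(fn, max_retries=6, base_delay=1.0, max_delay=60.0):
    """Retry fn() on 429 / rate-limit errors with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:  # ideally catch the SDK's specific RateLimitError
            message = str(exc)
            if "429" not in message and "Too Many Requests" not in message:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt)) + random.uniform(0, 1)
            print(f"Rate limited (attempt {attempt + 1}/{max_retries}); retrying in {delay:.1f}s")
            time.sleep(delay)
    raise RuntimeError("Still rate limited after the maximum number of retries")

# Example: wrap the embedding/upload step
# dbx = with_exponential_backoff(
#     lambda: PineconeVectorStore.from_documents(documents=docs, index_name=index_name, embedding=embeddings)
# )
```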
Questions:
- Why does `AzureAIDocumentIntelligenceLoader` produce a lower vector count than `PyPDFium2Loader`?
- Could the way `AzureAIDocumentIntelligenceLoader` processes the document contribute to the 429 errors?
- How can I effectively manage rate-limit errors when using the Azure API with GPT-4 embeddings on large documents?
Snippet –
from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader, PyPDFium2Loader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

def extract_embeddings_upload_index(pdf_path, index_name):
    print(f"Loading PDF from path: {pdf_path}")
    # Load PDF documents with Azure Document Intelligence (prebuilt-layout, single mode).
    # api_endpoint and api_key are defined elsewhere (loaded from configuration).
    txt_docs = AzureAIDocumentIntelligenceLoader(
        api_endpoint=api_endpoint,
        api_key=api_key,
        file_path=pdf_path,
        api_model="prebuilt-layout",
        mode="single",
    ).load()
    # Previous loader that produced 276 vectors:
    # txt_docs = PyPDFium2Loader(pdf_path).load()

    # Split documents into overlapping chunks
    print("Splitting documents...")
    splt_docs = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=1000)
    docs = splt_docs.split_documents(txt_docs)
    print(f"Split into {len(docs)} chunks")

    # Initialize OpenAI embeddings
    print("Initializing OpenAI embeddings...")
    embeddings = OpenAIEmbeddings(model='text-embedding-ada-002')

    # Upload documents to the Pinecone index
    print("Initializing Pinecone Vector Store...")
    dbx = PineconeVectorStore.from_documents(documents=docs, index_name=index_name, embedding=embeddings)
    print(f"Uploaded {len(docs)} documents to Pinecone index '{index_name}'")