I’m trying to use Langchain’s MapReduceDocumentsChain
in combination with OpenAI's API to summarize the content of a large document, using the following chain (copied from Langchain's documentation):
from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)


def _create_document_summary_chain(self) -> MapReduceDocumentsChain:
    """
    Create the map-reduce summarization chain.
    """
    # Map step: summarize each chunk individually
    map_chain = LLMChain(
        llm=self._quick_scan_model.llm,
        prompt=SummaryPrompt.get_document_summary_map_prompt(),
    )
    # Reduce step: merge the per-chunk summaries into a single summary
    reduce_chain = LLMChain(
        llm=self._quick_scan_model.llm,
        prompt=SummaryPrompt.get_document_summary_reduce_prompt(),
    )
    combine_documents_chain = StuffDocumentsChain(
        llm_chain=reduce_chain, document_variable_name="docs"
    )
    # Collapse/combine summaries while keeping prompts under token_max tokens
    reduce_documents_chain = ReduceDocumentsChain(
        combine_documents_chain=combine_documents_chain,
        collapse_documents_chain=combine_documents_chain,
        token_max=4000,
    )
    return MapReduceDocumentsChain(
        llm_chain=map_chain,
        reduce_documents_chain=reduce_documents_chain,
        document_variable_name="docs",
        return_intermediate_steps=False,
    )
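For reference, the map and reduce prompts returned by SummaryPrompt are plain PromptTemplates keyed on the single "docs" variable used above; simplified stand-ins would look roughly like this (the real wording is longer):

from langchain.prompts import PromptTemplate

# Simplified stand-ins for what SummaryPrompt returns; the real templates are longer,
# but both expose the single "docs" input variable referenced above.
MAP_PROMPT = PromptTemplate.from_template(
    "Write a concise summary of the following text:\n\n{docs}\n\nCONCISE SUMMARY:"
)
REDUCE_PROMPT = PromptTemplate.from_template(
    "The following are partial summaries of a document:\n\n{docs}\n\n"
    "Combine them into a single consolidated summary."
)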
The chain is invoked through the following function, which takes a list of Langchain Document objects that have been chunked beforehand (I sketch the chunking step right after the function):
async def get_document_summary(self, chunks: list[Document]) -> str:
    """
    Get the summary for a given document text. Use the Langchain map-reduce summarizer.
    """
    # Retry a few times; on failure, fall through to the next attempt
    for i in range(self._retries):
        try:
            response = await self._summary_map_reduce_chain.ainvoke(chunks)
            return response["output_text"]
        except Exception as e:
            print(
                f"Document summarizer attempt {i + 1}/{self._retries} failed...")
            print(f"Error: {e}")
            continue
    return ""
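For completeness, the chunks are produced beforehand with a standard text splitter, roughly like this (the splitter settings here are illustrative, not my exact values):

from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter


def chunk_document(text: str) -> list[Document]:
    # Illustrative settings; the actual chunk_size/chunk_overlap I use have varied
    splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
    return splitter.create_documents([text])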
The problem is that when running the chain on big documents (> 500 chunks), I get a tokens-per-minute (TPM) rate limit error like the following:
An error occurred: RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-xxx on tokens per min. Limit: 90000 / min. Current: 89369 / min. Contact us through our help center at help.openai.com if you continue to have issues.
I tried looking around, but no one seems to have the same problem as I do, or at least not when using the same chain as I am. I also tried playing around with the chunk size and the token_max parameter, but to no avail.
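Concretely, those tweaks amounted to variations of the following (the numbers are just examples of the values I experimented with); neither smaller chunks nor a lower token_max made the error go away:

# Smaller chunks at split time...
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

# ...and a lower token budget for the collapse/reduce step
# (combine_documents_chain as defined in _create_document_summary_chain above)
reduce_documents_chain = ReduceDocumentsChain(
    combine_documents_chain=combine_documents_chain,
    collapse_documents_chain=combine_documents_chain,
    token_max=2000,  # down from 4000
)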