I am trying to run QA over a small document. The LLM I am using is gpt-3.5-turbo-instruct, the retriever is built on a FAISS vector store, and my LangChain version is 0.2.6.
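For context, llm and ret are set up roughly like this (the file path and splitter settings are illustrative, not my exact values):

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

llm = OpenAI(model="gpt-3.5-turbo-instruct")  # completion model, 4097-token context window

docs = TextLoader("doc.txt").load()           # placeholder path for the small document
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

ret = FAISS.from_documents(chunks, OpenAIEmbeddings()).as_retriever()  # default similarity search, k=4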
When I use RetrievalQA from langchain, it works:
from langchain.chains import RetrievalQA, RetrievalQAWithSourcesChain

qa = RetrievalQA.from_chain_type(
    llm=llm, retriever=ret, chain_type="stuff"
)
resp = qa.invoke(testQuery)
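This returns the answer under the "result" key (passing a bare string works because the chain has a single input key, "query"):

print(resp["result"])  # same as qa.invoke({"query": testQuery})["result"]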
But the equivalent RetrievalQAWithSourcesChain call throws an error:
qa = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm, chain_type="stuff", retriever=ret
)
resp = qa.invoke(testQuery)
openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens, however you requested 4586 tokens (4330 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}
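I suspect the overflow comes from the sources chain's default "stuff" prompt: each retrieved chunk is wrapped in a Content/Source template and the combine prompt itself includes a long few-shot example, so the same retrieved chunks that fit for RetrievalQA no longer fit in gpt-3.5-turbo-instruct's 4097-token window. As a workaround I can cap the stuffed context, since RetrievalQAWithSourcesChain exposes reduce_k_below_max_tokens and max_tokens_limit fields (3375 is the library default):

qa = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=ret,
    reduce_k_below_max_tokens=True,  # drop retrieved docs until they fit the token budget
    max_tokens_limit=3375,           # token budget for the stuffed documents
)
resp = qa.invoke(testQuery)

But why does RetrievalQAWithSourcesChain exceed the context window when RetrievalQA with the same retriever and chain type does not, and is there a recommended way to handle this?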