I am building a chatbot with Streamlit on top of OCI Generative AI (the Cohere model) through LangChain. The first prompt works, but the second or third prompt throws the error below. It was working before and only started failing recently, and I am not aware of any quota or token limit that should kick in partway through a Streamlit session after a few exchanges. I have checked how the prompts are being sent, and I even tried dropping the create_history_aware_retriever method and running a plain Streamlit chatbot without memory. For context, the original setup is roughly sketched below.
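This is only a simplified sketch, not the actual code: run_chain is a hypothetical stand-in for the LangChain / OCI Generative AI call; the real app retrieves context and folds the accumulated chat history into every prompt.

import streamlit as st


def run_chain(question: str, history: list[tuple[str, str]]) -> str:
    # Hypothetical placeholder: the real app retrieves context, adds the
    # chat history, and calls the Cohere model on OCI Generative AI.
    return f"(model answer to: {question})"


if "history" not in st.session_state:
    st.session_state.history = []  # list of (user, assistant) turns

question = st.chat_input("Ask a question")
if question:
    answer = run_chain(question, st.session_state.history)
    # Every turn is appended, so whatever gets folded into the next prompt
    # grows with each question.
    st.session_state.history.append((question, answer))

for user_msg, assistant_msg in st.session_state.history:
    with st.chat_message("user"):
        st.write(user_msg)
    with st.chat_message("assistant"):
        st.write(assistant_msg)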
It still fails with the following error:
oci.exceptions.ServiceError: {
  'target_service': 'generative_ai_inference',
  'status': 400,
  'code': '400',
  'opc-request-id': '62F4609E6A4349A3B312ED1D73380147/94883F1D68C1BD57118373ED41B882E3/A7E92505DFB5E7811684526204E71E51',
  'message': '{"message":"too many tokens: total number of tokens in the prompt cannot exceed 4081 – received 20861. Try using a shorter prompt, or enabling prompt truncating. See https://docs.cohere.com/reference/generate for more details."}',
  'operation_name': 'generate_text',
  'timestamp': '2024-05-25T16:12:53.250556+00:00',
  'client_version': 'Oracle-PythonSDK/2.126.4',
  'request_endpoint': 'POST https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/generateText',
  'logging_tips': 'To get more info on the failing request, refer to https://docs.oracle.com/en-us/iaas/tools/python/latest/logging.html for ways to log the request/response details.',
  'troubleshooting_tips': "See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_400__400_400 for more information about resolving this error. Also see https://docs.oracle.com/iaas/api/#/en/generative-ai-inference/20231130/GenerateTextResult/GenerateText for details on this operation's requirements. If you are unable to resolve this generative_ai_inference issue, please contact Oracle support and provide them this full error message."
}
Here is the code (posted in my answer below). The same token-limit failure also shows up as:
CohereAPIError: too many tokens: total number of tokens in the prompt cannot exceed 4081 – received 15416
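For reference, this is roughly how I have been eyeballing the prompt size per turn. The helpers below are only illustrative, and the 4-characters-per-token figure is a crude estimate rather than Cohere's actual tokenizer:

def approx_tokens(text: str) -> int:
    # Crude estimate: assume roughly 4 characters per token.
    return max(1, len(text) // 4)


def log_prompt_size(question: str, context: str, history: list[tuple[str, str]]) -> None:
    # Add up the question, the retrieved context, and the chat history that
    # end up in the prompt; the model rejects anything over 4081 tokens.
    history_text = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in history)
    total = approx_tokens(question) + approx_tokens(context) + approx_tokens(history_text)
    print(f"approximate prompt tokens this turn: {total}")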