I’m using generative AI chat APIs (OpenAI’s GPT models, Anthropic’s Claude) to build a conversational assistant that handles multi-turn dialogues. My goal is to maintain the context of the conversation without resending all previous prompts and responses with each new request, since that approach quickly becomes expensive due to token usage.
What I Have Tried
Currently, I append each new user message and the assistant’s response to a list and send the entire list with every API call:
```python
conversation_history.append({"role": "user", "content": user_message})
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=conversation_history,
)
conversation_history.append(
    {"role": "assistant", "content": response['choices'][0]['message']['content']}
)
# ...
```
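The only workaround I’ve come up with so far is trimming older turns before each call so the history stays bounded. Here’s a rough sketch of what I mean (the `MAX_TURNS` limit and `trimmed_history` helper are just my own illustration, counting messages rather than actual tokens):

```python
# Keep only the most recent turns to cap token usage per request.
# MAX_TURNS is an arbitrary cutoff; a real solution would count tokens instead.
MAX_TURNS = 10

def trimmed_history(history):
    # Preserve a leading system prompt, if present, then keep the last N messages.
    system = [m for m in history[:1] if m["role"] == "system"]
    rest = history[len(system):]
    return system + rest[-MAX_TURNS:]
```

But this loses older context entirely, which is why I’m asking whether there’s a better option.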
My Question
Is there a way to maintain the context of a conversation with the API without resending all previous prompts and responses in each request? Ideally, I’m looking for a way to retain the conversational state on the server side, or at least a more efficient way to manage the context.
- Using GPT or Claude API.
- The primary concern is minimizing token usage to reduce costs.