I am using Chat Completions from the openai library for Python. Something like this (simplified code, just to show what I am doing):
self.__response = self.client.chat.completions.create(
    model='gpt-4',
    messages=messages,
    stream=True
)
After this I just loop through chunks:
for chunk in self.__response:
    text = chunk.choices[0].delta.content  # note: can be None on some chunks
    # Processing text here
Is it enough to just break out of the loop to stop the server from generating the rest of the response (and wasting tokens) when I see that the response is not meeting my expectations? Or is there a proper way to achieve this?
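To make the question concrete, here is a minimal runnable sketch of what I mean by breaking out of the loop early. It uses a stand-in generator instead of the real stream (the real call needs an API key); fake_stream and the text values are just placeholders:

```python
def fake_stream():
    # Stand-in for the streamed response; each item mimics one chunk's text.
    for piece in ["Hello", ", ", "world", "!", " More text..."]:
        yield piece

collected = []
for text in fake_stream():
    collected.append(text)
    if text == "!":
        # Stop consuming the stream as soon as the output looks wrong.
        break

print("".join(collected))  # → "Hello, world!"
```

The break stops my client from iterating further, but I am not sure whether it also tells the server to stop generating, which is the heart of my question.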
Thanks in advance for your help.