How to yield only newly generated text when streaming from vLLM
I have a Flask application that runs a RAG pipeline in the background and streams responses from vLLM. I currently use a generator function to parse the streamed JSON responses and yield them to the client. The problem is that each streamed chunk includes the previously generated text rather than just the newest content. How can I yield only the text that is new since the last chunk?
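For reference, here is a minimal sketch of the behaviour I am after. It assumes each streamed chunk is a JSON object whose `text` field holds the full text generated so far (either a string or a one-element list), optionally prefixed with `data:`. The function name `yield_new_text`, the `text_key` parameter, and that payload shape are my assumptions for illustration, not something I have confirmed against the vLLM API.

```python
import json
from typing import Iterable, Iterator, Union


def yield_new_text(
    lines: Iterable[Union[bytes, str]], text_key: str = "text"
) -> Iterator[str]:
    """Yield only the text added since the previous chunk.

    Assumes every chunk carries the cumulative text so far under
    `text_key` (hypothetical payload shape, adjust to the real one).
    """
    seen = ""  # cumulative text that has already been sent to the client
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        # Tolerate server-sent-event framing ("data: {...}").
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        # Skip keep-alive blanks and an end-of-stream sentinel.
        if not line or line == "[DONE]":
            continue

        text = json.loads(line)[text_key]
        if isinstance(text, list):  # some servers wrap the text in a list
            text = text[0]

        if text.startswith(seen):
            delta = text[len(seen):]  # only the unseen suffix
        else:
            delta = text  # prefix mismatch: fall back to the whole chunk
        seen = text

        if delta:
            yield delta
```

In the Flask view, the generator could wrap the streamed lines of the upstream response, for example `Response(stream_with_context(yield_new_text(upstream.iter_lines())), mimetype="text/plain")`, where `upstream` would be a `requests` response opened with `stream=True`. If the server already sends incremental deltas rather than cumulative text, this prefix stripping should not be applied.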