Why is RAG slower than calling the LLM directly?

I used RAG with Llama 3 for an AI bot. I find that RAG with ChromaDB is much slower than calling the LLM by itself.
According to the test results below, with just one simple web page of about 1,000 words, the retrieval step alone takes more than 2 seconds: