Using the right embedding for llama2
I have the code below, which works well when I connect to OpenAI. I am trying to do the same with llama2, but I am having trouble choosing the right embedding technique. In the code below I use OpenAIEmbeddings; I would like help figuring out how this can be tweaked to run well with a llama2 model.
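A minimal sketch of the tweak, assuming you are using LangChain and serving llama2 locally through Ollama (`ollama pull llama2`): `OllamaEmbeddings` from `langchain_community` exposes the same `embed_query`/`embed_documents` interface as `OpenAIEmbeddings`, so it can usually be dropped in without touching the rest of the pipeline. The model name and base URL below are assumptions; adapt them to your setup.

```python
# Sketch: replace OpenAIEmbeddings with an embedding class that talks
# to a locally served llama2 model via Ollama.
from langchain_community.embeddings import OllamaEmbeddings

# Point at the local Ollama server (assumption: default install on port 11434).
embeddings = OllamaEmbeddings(
    model="llama2",
    base_url="http://localhost:11434",
)

# Same interface as OpenAIEmbeddings, so downstream code
# (vector stores, retrievers) should not need changes.
query_vector = embeddings.embed_query("What is a llama?")
doc_vectors = embeddings.embed_documents(["Llamas are camelids."])
print(len(query_vector), len(doc_vectors))
```

One caveat: llama2 is a chat model, not an embedding model, so for retrieval quality a dedicated embedding model (for example nomic-embed-text, also pullable through Ollama) tends to work better than embedding with llama2 itself.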
Ollama in local and remote (VPS) Docker instances streams its chunk replies with different token sizes
I have set up the Ollama Docker image with Docker Compose, both locally and on a VPS.
Same image (ollama/ollama:latest), same Ollama configuration (no env variables), same model (llama3:8b).
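To compare what the two instances actually send, a small script like the one below (an illustrative sketch, not part of the original setup) streams a completion from Ollama's /api/generate endpoint and records the size of each chunk. The host URLs are placeholders; run it once against the local container and once against the VPS.

```python
# Sketch: measure the text length of each streamed chunk from an Ollama
# instance, to compare chunking behavior between two deployments.
import json
import requests

def chunk_sizes(base_url: str, prompt: str = "Why is the sky blue?"):
    """Stream a completion and record the length of each chunk's text."""
    sizes = []
    with requests.post(
        f"{base_url}/api/generate",
        json={"model": "llama3:8b", "prompt": prompt, "stream": True},
        stream=True,
        timeout=120,
    ) as resp:
        resp.raise_for_status()
        # Ollama streams newline-delimited JSON objects; the generated
        # text fragment for each chunk is in the "response" field.
        for line in resp.iter_lines():
            if line:
                part = json.loads(line)
                sizes.append(len(part.get("response", "")))
    return sizes

local = chunk_sizes("http://localhost:11434")       # local docker compose
remote = chunk_sizes("http://your-vps-host:11434")  # placeholder VPS host
print("local chunks:", len(local), "avg size:", sum(local) / len(local))
print("remote chunks:", len(remote), "avg size:", sum(remote) / len(remote))
```

If the measured sizes differ, it is worth ruling out network buffering first: a reverse proxy in front of the VPS can coalesce several small chunks into one larger one, so test directly against port 11434 on both hosts before blaming Ollama itself.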