I want to know why my chatbot generates and types out the answer to each question much more slowly than normal. Let me explain in more detail below:
I have created a chatbot application with the following dependencies:
python 3.10.10
ollama 0.3.6 (https://ollama.com/)
chromadb==0.5.3
streamlit==1.36.0
langchain_core==0.2.9
langchain_community==0.2.5
PyPDF2
pypdf==4.2.0
langdetect==1.0.9
Since my goal is to run the application on a server, I containerized it with Docker. For that, I created a Dockerfile, a docker-compose.yml, and a start.sh.
- The Dockerfile installs all the dependencies, exposes the chatbot port, and so on.
- docker-compose.yml combines two services: one for the Ollama container and a second for the chatbot application. The Ollama service also executes start.sh.
- start.sh starts the Ollama server inside the container and lists the Ollama models, which get pulled when the script runs (a sketch of start.sh is shown below).
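Roughly, start.sh looks like this (a minimal sketch; the model names are placeholders for the ones I actually pull):

#!/bin/sh
# Start the Ollama server in the background
ollama serve &

# Give the server a moment to come up before pulling models
sleep 5

# Pull the models used by the chatbot (placeholder names)
ollama pull llama3
ollama pull nomic-embed-text

# Keep the container alive on the server process
wait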
Finally, on my local computer, when I execute the command docker-compose up --build, it builds the images, installs Ollama, starts the Ollama server, and pulls the Ollama models inside the Docker container. Once the process finishes, I open http://localhost:8501 to start the chatbot application. Then I upload a document, type a question, and hit Enter to get the answer.
About the answer: I receive the answer to a question after a very long time compared to the scenario where I run the chatbot application on my local computer without the Docker container. This is a bit weird to me.
Why am I receiving answers so much later when the application runs inside the Docker container, compared to running the application without the Docker container?
For your reference, I have added my docker-compose.yml file below:
services:
  ollama:
    container_name: ollama_v5
    image: ollama/ollama:latest
    restart: unless-stopped
    volumes:
      - "./ollamadata:/root/.ollama"
      - "./start.sh:/start.sh"  # Mount the script into the container
    ports:
      - "11434:11434"
    entrypoint: /start.sh
    networks:
      - ollama_network

  chatbot:
    container_name: chatbot_v5
    build:
      context: ./  # The directory where Dockerfile and code are located
      dockerfile: Dockerfile
    restart: unless-stopped
    environment:
      - BASE_URL=http://ollama:11434  # Chatbot will access the Ollama API
    ports:
      - "8501:8501"
    depends_on:
      - ollama
    networks:
      - ollama_network

networks:
  ollama_network:
    driver: bridge
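For context, the chatbot reads BASE_URL roughly like this (a minimal sketch; the model name and the chain setup are simplified placeholders):

import os
from langchain_community.llms import Ollama

# BASE_URL is set by docker-compose; fall back to localhost for non-Docker runs
base_url = os.environ.get("BASE_URL", "http://localhost:11434")

# "llama3" is a placeholder for whichever model start.sh pulls
llm = Ollama(base_url=base_url, model="llama3")

print(llm.invoke("What is in the uploaded document?"))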
Do you have any options set that limit resources? Something like:
services:
  chatbot:
    deploy:
      resources:
        limits:
          cpus: '0.5'    # Limit to 50% of one CPU
          memory: 512M   # Limit to 512 MB of memory
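You can check whether any limits are actually applied to the running container:

# Prints the CPU limit (in nano-CPUs) and the memory limit in bytes; 0 means unlimited
docker inspect -f '{{.HostConfig.NanoCpus}} {{.HostConfig.Memory}}' chatbot_v5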
If you want, try setting the container to privileged mode:

services:
  my_service:
    image: my_image
    privileged: true  # Enable privileged mode for the container
If this doesn't help, you may have a problem with the Docker bridge network.
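One way to narrow it down (a sketch; the container names match the compose file above, and curl must be available inside the chatbot image):

# Live CPU/memory usage of both containers while a question is being answered
docker stats ollama_v5 chatbot_v5

# From inside the chatbot container, time a request to the Ollama API
# (/api/tags is Ollama's model-listing endpoint)
docker exec chatbot_v5 sh -c "time curl -s http://ollama:11434/api/tags"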