I developed a crawler, but I am struggling to dockerize it together with its database. The source code is available at: https://github.com/brunolnetto/RF_CNPJ/tree/dockerizer-crawler. It is hard to summarize the whole implementation, but I will try:
- The crawler is a script at `src/main.py`;
- The Docker services are called `crawler` and, for the database, `crawler-db`;
- I set the required environment variables in `.env.template`, which you can rename to `.env`;
- The Dockerfile has a statement containing `* * * * * python3 /app/src/main.py >> /app/logs/cron.log 2>&1`, which echoes the required cron job settings to a local cron file.
The first issue concerns running the script: it should populate the database, but it does not even start correctly. Additionally, SQLAlchemy cannot resolve the database hostname to the postgres Docker service. The log shows:
crawler | Traceback (most recent call last):
crawler | File "/app/src/main.py", line 8, in <module>
crawler | from setup.logging import logger
crawler | File "/app/src/setup/logging.py", line 5, in <module>
crawler | from dotenv import load_dotenv
crawler | ModuleNotFoundError: No module named 'dotenv'
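To narrow down the `ModuleNotFoundError`, a small diagnostic run inside the container can show which interpreter is executing and which modules it cannot import. This is only a sketch: the helper name `missing_modules` is hypothetical, and the list of modules checked is my guess at what the script needs. Note that the `dotenv` import name comes from the `python-dotenv` package on PyPI, so one thing worth checking is whether that package is installed for the same `python3` that cron invokes.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which interpreter is running; cron may use a different one than pip did.
    print("interpreter:", sys.executable)
    # "dotenv" is the import name provided by the python-dotenv package.
    print("missing:", missing_modules(["dotenv", "sqlalchemy"]))
```

If `dotenv` shows up as missing here, the dependencies were probably not installed in the image (or were installed for another interpreter).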
You can reproduce the issue by running `docker compose build && docker compose up`.
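Regarding the hostname resolution problem: inside a Compose network, containers reach each other by service name, so the database host should be `crawler-db` rather than `localhost`. Below is a minimal sketch of how the connection URL could be assembled. The environment variable names (`POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_DB`) and the function `build_database_url` are assumptions for illustration, not necessarily what the repository uses.

```python
import os
from urllib.parse import quote_plus


def build_database_url(env=os.environ):
    """Assemble a PostgreSQL URL from environment variables (names are hypothetical)."""
    user = env.get("POSTGRES_USER", "postgres")
    # Escape special characters such as "@" so they do not break the URL.
    password = quote_plus(env.get("POSTGRES_PASSWORD", ""))
    # Inside the Compose network the host is the service name, not localhost.
    host = env.get("POSTGRES_HOST", "crawler-db")
    port = env.get("POSTGRES_PORT", "5432")
    name = env.get("POSTGRES_DB", "postgres")
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


if __name__ == "__main__":
    print(build_database_url())
```

The resulting string is in the form that SQLAlchemy's `create_engine` accepts.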
Any help is appreciated. Thanks, guys!