I’m running a Docker Compose setup with Spark and Kafka, but I’m encountering connectivity issues when Spark tries to connect to Kafka. My docker-compose.yml file includes services for both Kafka and Spark, and everything seems to be set up correctly. However, when running a Spark job that consumes data from Kafka, I get the following errors:
24/12/13 23:28:02 WARN NetworkClient: [AdminClient clientId=adminclient-1] Connection to node 1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.
version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:latest
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
  spark:
    image: bitnami/spark:latest
    ports:
      - "8080:8080"
    volumes:
      - 'C:/docker-proyecto/spark-data:/opt/spark'
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
  spark-worker:
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark-master:7077
      - SPARK_WORKER_MEMORY=1G
      - SPARK_WORKER_CORES=1
    depends_on:
      - spark
    volumes:
      - 'C:/docker-proyecto/spark-data:/opt/spark'
I’m running the following spark-submit command:
spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.3 /opt/spark/script.py
In the script, I’m trying to consume data from Kafka:
kafka_servers = "localhost:9092"

# Parentheses let the chained calls span several lines.
stream_df = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", kafka_servers)
    .option("subscribe", "topic1,topic2")
    .option("startingOffsets", "earliest")
    .load()
)
The goal is for Spark to connect to Kafka, read the data from the topics, transform it, and save it to HDFS. However, it can’t even establish the connection, and I don’t know why. I’ve tried everything my limited Docker experience allows.
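One way to narrow this down is to check which addresses are reachable over plain TCP from inside the Spark container. The following is only a minimal sketch: the helper name `can_connect` is hypothetical, and it assumes that a Python interpreter is available in the container. It tests the TCP connection only, not the Kafka protocol or the address the broker advertises.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, calling `can_connect("localhost", 9092)` and `can_connect("kafka", 9092)` from the Spark container would show whether the address used in the script or the Compose service name `kafka` accepts connections there.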