We use the Kafka source with Spark 3.4.3 to read data from topics. We consume two topics, k8s-v2.in.mongo.mongo-conn.ninja.xyz.PolicyDetail
and k8s-v2.in.mongo.mongo-conn.ninja.xyz.PolicyDetailHistory.
The code looks like this:
spark.readStream.format("kafka")
.option("kafka.bootstrap.servers", "host:9092")
.option("subscribePattern", "{regex}")
.option("startingOffsets", "earliest")
.option("failOnDataLoss", "false")
.option("maxOffsetsPerTrigger", 1000000)
.option("group.id", "group1")
.load()
When we use k8s-v2.in.mongo.mongo-conn.ninja.xyz.PolicyDetail
as the regex, it consumes all topics from Kafka.
When we use ^(k8s-v2.in.mongo.mongo-conn.ninja.xyz.PolicyDetail)$
it consumes both PolicyDetail and PolicyDetailHistory. I verified this regex on an online regex tester, where it works as expected and matches only PolicyDetail. I'm not sure why it's not working with Spark.
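For reference, here is a minimal Python sketch of the kind of offline check I mean. It is not Spark code: it uses Python's re module rather than the JVM regex engine, and the helper name matched_topics is made up for illustration. It escapes the dots so they match literally and compares search-style matching with full matching, without assuming which of the two Spark applies to subscribePattern.

```python
import re

# The two topic names from the question.
topics = [
    "k8s-v2.in.mongo.mongo-conn.ninja.xyz.PolicyDetail",
    "k8s-v2.in.mongo.mongo-conn.ninja.xyz.PolicyDetailHistory",
]

# Escape each "." so it matches a literal dot instead of any character.
escaped = topics[0].replace(".", r"\.")
anchored = "^(" + escaped + ")$"


def matched_topics(pattern, names, full):
    """Hypothetical helper: return the names that the pattern matches.

    full=True requires the whole name to match (fullmatch);
    full=False accepts a match anywhere in the name (search).
    """
    regex = re.compile(pattern)
    test = regex.fullmatch if full else regex.search
    return [name for name in names if test(name)]


search_unanchored = matched_topics(escaped, topics, full=False)
search_anchored = matched_topics(anchored, topics, full=False)
full_unanchored = matched_topics(escaped, topics, full=True)

print("search, unanchored:", search_unanchored)
print("search, anchored:  ", search_anchored)
print("full,   unanchored:", full_unanchored)
```

With these two names, only the unanchored search returns both topics, because PolicyDetail is a prefix of PolicyDetailHistory. The anchored pattern and the full match each return PolicyDetail alone.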
Thanks in advance