I want to connect MongoDB Atlas to Spark inside a Microsoft Fabric notebook. Here is my PySpark code:
from pyspark.sql import SparkSession

mongo_uri = "mongodb+srv://developer:<password>@cluster1.hju3l.mongodb.net/?retryWrites=true&w=majority&appName=Cluster1"

my_spark = (
    SparkSession.builder
    .appName("myApp")
    .config("spark.mongodb.read.connection.uri", mongo_uri)
    .config("spark.mongodb.write.connection.uri", mongo_uri)
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:10.2.0")
    .getOrCreate()
)

df = (
    my_spark.read.format("mongodb")
    .option("database", "lead")
    .option("collection", "users")
    .load()
)
df.printSchema()
But when I try to run the code above, it throws the error below:
Py4JJavaError: An error occurred while calling o6558.load.
: org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find the data source: mongodb. Please find packages at `https://spark.apache.org/third-party-projects.html`.
From searching, the cause of this error appears to be that the mongo-spark-connector jar file cannot be found, but I have uploaded the jar in the custom library section of Microsoft Fabric and also installed mongoengine in the public library section. I have also uploaded the same jar file (mongo-spark-connector_2.12-10.2.0.jar) into the notebook's Spark environment. Below is the screenshot.
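For reference, my understanding is that the Fabric notebook attaches to a pre-started Spark session, so `spark.jars.packages` set via `SparkSession.builder` after the session exists may have no effect. The sketch below is the session-level configuration I believe would pull the connector at startup instead, assuming Fabric notebooks support the Synapse-style `%%configure` cell magic (I have not confirmed this fixes the error):

```
%%configure -f
{
    "conf": {
        "spark.jars.packages": "org.mongodb.spark:mongo-spark-connector_2.12:10.2.0"
    }
}
```

This would need to run in the first cell of the notebook, before any Spark code, so the packages are resolved when the session is created.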