I’ve created a share in my Databricks workspace and I’m trying to consume it from my local workstation using PySpark 3.5.3. The read raises an UnexpectedHttpStatus exception, and the message seems to say that I need to reference some form of compute on the workspace.
import delta_sharing
from pyspark.sql import SparkSession

# Profile file downloaded from the share's activation link
config_path = "/path/to/config.share"
client = delta_sharing.SharingClient(config_path)

# Maven coordinates passed to Spark via spark.jars.packages
dependencies = [
    "io.delta:delta-core_2.12:2.3.0",
    "io.delta:delta-sharing-spark_2.12:3.2.1",
]

spark = (
    SparkSession.builder
    .master("local")
    .config("spark.jars.packages", ",".join(dependencies))
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Table URL format: <profile-file-path>#<share>.<schema>.<table>
table_url = f"{config_path}#share_name.schema_name.table_name"
spark.read.format("deltaSharing").load(table_url).limit(10).show()
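To rule out a malformed table URL on my side, here is a minimal sketch of how I understand the URL is meant to be assembled, assuming the `<profile-file-path>#<share>.<schema>.<table>` form. `build_table_url` is a hypothetical helper of my own, not part of `delta_sharing`, and the share, schema and table names are the same placeholders as above.

```python
def build_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Join a profile file path and a three-part table name into a table URL."""
    # Reject empty parts early so a bad URL fails here rather than at read time.
    for name, value in (("share", share), ("schema", schema), ("table", table)):
        if not value:
            raise ValueError(f"{name} must be non-empty")
    return f"{profile_path}#{share}.{schema}.{table}"


# Placeholder values only; substitute the real profile path and names.
table_url = build_table_url(
    "/path/to/config.share", "share_name", "schema_name", "table_name"
)
print(table_url)
```

As far as I can tell the URL I pass to `load` has this shape, so I don't think it is the cause. The read on the last line of my script is what fails, with the output below.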
24/12/09 10:55:17 ERROR DeltaSharingRestClient: DeltaSharingRestClient with enableAsyncQuery false
24/12/09 10:55:18 ERROR DeltaSharingRestClient: DeltaSharingRestClient with enableAsyncQuery false
24/12/09 10:55:22 ERROR RetryUtils: Error during retry attempt 1, retryDuration=2459, totalDuration=2459 : HTTP request failed with status: HTTP/1.1 400 Bad Request {"error_code":"INVALID_PARAMETER_VALUE","message":"One of job_cluster_key, new_cluster, or existing_cluster_id must be specified. Serverless compute for workflows is not enabled in the workspace.","details":[{"@type":"type.googleapis.com/google.rpc.RequestInfo","request_id":"xxxxx","serving_data":""}]}.
io.delta.sharing.client.util.UnexpectedHttpStatus: HTTP request failed with status: HTTP/1.1 400 Bad Request {"error_code":"INVALID_PARAMETER_VALUE","message":"One of job_cluster_key, new_cluster, or existing_cluster_id must be specified. Serverless compute for workflows is not enabled in the workspace.","details":[{"@type":"type.googleapis.com/google.rpc.RequestInfo","request_id":"xxxxx","serving_data":""}]}.
at io.delta.sharing.client.DeltaSharingRestClient.$anonfun$getResponse$1(DeltaSharingClient.scala:1032)
at io.delta.sharing.client.util.RetryUtils$.runWithExponentialBackoff(RetryUtils.scala:40)