I was trying to query a ms sql server with spark but i cannot get it to display the resulting dataframe as spark is sending a literal “LIMIT” to the sql server resulting in a SQLServerException.
The following query runs absolutly fine
df = spark.read.format('sqlserver')
.option('host', sql_server_host)
.option('port', sql_server_port)
.option('database', sql_server_database)
.option('user', sql_server_user)
.option('password', sql_server_password)
.option("dbtable", 'dbo.TestTable')
.load()
but when running
df.display() or display(df)
I get the following error: SQLServerException: Incorrect syntax near ‘10001’.
Upon futher inspection I can see that spark is sending the following query to the sql server which obviouly fails because sql server does not support LIMIT.
SELECT TestColumn FROM dbo.TestTable LIMIT 10001
Is there anything wrong with my configuration and is there any workaround?
1