I need to query Azure Databricks Delta Lake data using C# .NET.
Databricks now supports a method known as the Statement Execution API.
I have put together a simple console app that is supposed to query Databricks via this API:
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

namespace DatabricksApiApp
{
    class Program
    {
        static readonly HttpClient client = new HttpClient();

        static async Task Main(string[] args)
        {
            try
            {
                // Set up the endpoint and the access token
                string baseUrl = "https://adb-7786225719779619.19.azuredatabricks.net/api/2.0/sql/statements";
                string accessToken = "dapierd7a4665da6d0e05a1ec650b85570b70";
                string warehouseId = "ff7b6377ff77ab94";

                // Prepare the HTTP request
                HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post, baseUrl)
                {
                    Content = new StringContent(
                        "{\"statement\":\"SELECT * FROM default.product_1\", \"warehouse_id\":\"{warehouseId}\", \"timeout_seconds\":600}",
                        Encoding.UTF8,
                        "application/json"
                    )
                };
                request.Headers.Add("Authorization", $"Bearer {accessToken}");

                // Send the request
                HttpResponseMessage response = await client.SendAsync(request);

                // Check the response
                if (response.IsSuccessStatusCode)
                {
                    string responseBody = await response.Content.ReadAsStringAsync();
                    Console.WriteLine("Data retrieved successfully!");
                    Console.WriteLine(responseBody);
                }
                else
                {
                    Console.WriteLine($"Failed to retrieve data. Status code: {response.StatusCode}");
                    string responseBody = await response.Content.ReadAsStringAsync();
                    Console.WriteLine($"Response Body: {responseBody}");
                }
            }
            catch (HttpRequestException e)
            {
                Console.WriteLine("\nException Caught!");
                Console.WriteLine("Message :{0} ", e.Message);
            }
        }
    }
}
When I run this code, I receive the following error:
Failed to retrieve data. Status code: BadRequest
Response Body: {"error_code":"INVALID_PARAMETER_VALUE","message":"{warehouseId} is not a valid endpoint id."}
When I navigate to the Azure Databricks workspace, select “SQL Warehouses”, select the warehouse I am interested in, and open the “Connection details” tab, I see the following entry under the “HTTP Path” field:
/sql/1.0/warehouses/ff7b6377ff77ab94
Naturally, I am assuming that “ff7b6377ff77ab94” is the warehouse ID required by the Statement Execution API.
Am I doing something wrong, or is something missing?
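
One thing worth noting: the error message echoes the literal text `{warehouseId}` rather than the ID itself, which suggests the placeholder reaches the server unsubstituted. The JSON string passed to `StringContent` is a plain string, not an interpolated one, so `{warehouseId}` is never replaced. Below is a minimal sketch of one way the body could be built so the variable's value is actually included. It assumes `System.Text.Json` is available (.NET Core 3.0 or later) and that the last segment of the HTTP Path is the warehouse ID, as guessed above. The class name `RequestBodySketch` is hypothetical, and the `timeout_seconds` field is left out because I have not confirmed it against the API reference.

```csharp
using System;
using System.Text.Json;

class RequestBodySketch
{
    static void Main()
    {
        // Assumption: the last segment of the "HTTP Path" is the warehouse ID.
        string httpPath = "/sql/1.0/warehouses/ff7b6377ff77ab94";
        string warehouseId = httpPath.Substring(httpPath.LastIndexOf('/') + 1);

        // Serializing an object inserts the value of warehouseId and takes
        // care of JSON quoting, so no hand-written escaping is needed.
        string body = JsonSerializer.Serialize(new
        {
            statement = "SELECT * FROM default.product_1",
            warehouse_id = warehouseId
        });

        // Print the body to check that warehouse_id holds the real ID
        // and not the literal text "{warehouseId}".
        Console.WriteLine(body);
    }
}
```

The resulting `body` string could then be passed to `new StringContent(body, Encoding.UTF8, "application/json")` in place of the hand-written JSON literal.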