Below is my SQL query that I run in databricks SQL editor:
SELECT
orders.GroceryStore,
TO_JSON(COLLECT_LIST(MAP('CustomerID', orders.CustomerID,'DiscountValue', orders.DiscountValue,'SalesAmount', orders.SalesAmount,'ChargesInfo', nested_json.ChargeDetails))) AS JsonLine
FROM (
SELECT h.GroceryStore, d.CustomerID, SUM(d.DiscountValue) AS DiscountValue, SUM(d.SalesAmount) AS SalesAmount
FROM SalesHeader h
LEFT JOIN SalesDetail d ON h.CustomerID = d.CustomerID
WHERE h.Date >= '2024-01-01'
GROUP BY h.GroceryStore, d.CustomerID
) AS orders
LEFT JOIN (
SELECT
d.CustomerID,
TO_JSON(COLLECT_LIST(MAP(
'ChargeType', d.ChargeType,
'ChargeAmount', d.ChargeAmount,
'PaidAmount', d.PaidAmount
))) AS ChargeDetails
FROM ChargesDetail d
GROUP BY d.CustomerID
) AS nested_json ON orders.CustomerID = nested_json.CustomerID
GROUP BY orders.GroceryStore;
When I run the query, this is the output that I got:
Based on the output, there are escape character for the fields in
ChargeInfo
. Is there anyway that I can modify my SQL query so that the output does not contain escape character? The desire output is:
[{"CustomerID":"0001ABC","DiscountValue":"126.33","SalesAmount":"2320.26","ChargesInfo":[{"ChargeType":"01","ChargeAmount":"20.26","PaidAmount":"11.22"}]}]
Take note that there is no ""
for the array of ChargeInfo
in the desire output as well.
Any help or advise will be greatly appreciated!