The following query will take over one hour to execute:
SELECT DISTINCT
t.TradeAmount
FROM crm.NewTransaction t
LEFT OUTER JOIN drd.PriceHistoric PPH
ON t.TradeDate = PPH.PriceDate
When I add the JOIN field t.TradeDate to the fields to be selected, the query runs in less than a second.
SELECT DISTINCT
t.TradeAmount
,t.TradeDate
FROM crm.NewTransaction t
LEFT OUTER JOIN drd.PMSPriceHistoric PPH
ON t.TradeDate = PPH.PriceDate
There must me something fundamental about SQL queries that is causing this.
Any thoughts on why just adding one of the fields (t.TradeDate) used to join the tables makes the query execute without any issues
I am executing the code on Azure Databricks
Recognized by Microsoft Azure Collective
2