I have a seemingly simple query that takes about 7 minutes to execute in a local SQLEXPRESS and about 40 minutes on Azure in a production SQL server.
The source table has about 1 million rows, the target has 10 million and is growing slowly. Unfortunately, the source table has duplicates and that’s why GROUP BY was added.
The query execution plan shows that it has a few heavy Sort operators that cause spilling to tempdb even locally, so I imagine it would be spilling on Azure even more.
As there are so many columns involved in grouping and the merge condition, I’m afraid that there’s no use of indexes here, I would have to index almost all the columns.
What would be a better solution to achieve the same results of inserting unique records from the source table into the target table when they differ and only updating the target timestamp when both sides match?
MERGE TargetTable AS target
USING (SELECT
[Name]
,[CategoryName]
,[Field01]
,[Field02]
,[Field03]
,[Field04]
,[Field05]
,[Field06]
,[Field07]
,[Field08]
,[Field09]
,[Field10]
,[Value]
,[StartDate]
,[EndDate]
FROM SourceTable
GROUP BY
[Name]
,[CategoryName]
,[Field01]
,[Field02]
,[Field03]
,[Field04]
,[Field05]
,[Field06]
,[Field07]
,[Field08]
,[Field09]
,[Field10]
,[Value]
,[StartDate]
,[EndDate]
)
AS source
(
[Name]
,[CategoryName]
,[Field01]
,[Field02]
,[Field03]
,[Field04]
,[Field05]
,[Field06]
,[Field07]
,[Field08]
,[Field09]
,[Field10]
,[Value]
,[StartDate]
,[EndDate]
)
ON
(
target.[Name] = source.[Name] and
target.[CategoryName] = source.[CategoryName] and
target.[Field01] = source.[Field01] and
target.[Field02] = source.[Field02] and
target.[Field03] = source.[Field03] and
UPPER(target.[Field04]) = source.[Field04] COLLATE SQL_Latin1_General_CP1_CS_AS and
UPPER(target.[Field05]) = source.[Field05] COLLATE SQL_Latin1_General_CP1_CS_AS and
UPPER(target.[Field06]) = source.[Field06] COLLATE SQL_Latin1_General_CP1_CS_AS and
target.[Value] = source.[Value] and
target.[StartDate] = source.[StartDate] and
target.[EndDate] = source.[EndDate]
)
WHEN MATCHED THEN
UPDATE SET [UpdatedDateTime] = GETUTCDATE()
WHEN NOT MATCHED THEN
INSERT (
[Name]
,[CategoryName]
,[Field01]
,[Field02]
,[Field03]
,[Field04]
,[Field05]
,[Field06]
,[Field07]
,[Field08]
,[Field09]
,[Field10]
,[Value]
,[StartDate]
,[EndDate]
,[UpdatedDateTime]
)
VALUES
(
source.[Name]
,source.[CategoryName]
,source.[Field01]
,source.[Field02]
,source.[Field03]
,source.[Field04]
,source.[Field05]
,source.[Field06]
,source.[Field07]
,source.[Field08]
,source.[Field09]
,source.[Field10]
,source.[Value]
,source.[StartDate]
,source.[EndDate]
, GETUTCDATE()
);