I am looking for a solution on how to handle rows that should be placed in a SCD2 table. In DLT (Delta Live Tables) this is easily solved by using apply_changes with sequence_by and primary key parameters set. Now I am looking for a solution (either in code or a general idea) in pure T-SQL.
SCD2 is quite easy to handle when the source row always arrive in the right order. When it does not, and we add on the complexity of multiple source rows in one batch, things get hard.
For SCD2 I am currently using merge which is perfect for comparing two rows, one for source and one for target.
One solution would be to dedupe source and let the latest row take precedence – ultimately “loosing” some history. But I would like to be able to have, say, 5 rows with the same business key in source, arriving out of order, and to be able to load them in with the correct sequencing on effective dates.
How would you approach this? What does Kimball teach us on this matter?