I have a huge dynamic source table that is continually getting populated with new data. In it there’s an ID column whose value can appear over multiple rows over time in the source dynamic table.
Is there a way to scan the source dynamic table, taking only the first row for each ID found, and then writing that to the destination only once? The destination table would be initially empty, but I’m hoping that there’s some way to have new rows in the source to be checked if they exist already in the destination based on the ID, and only write them to the destination if that ID is not already in the source?
What I’m struggling with is that if the source is huge, how can I avoid using something like row number/partitioning on the ID every single time – the cost must be excessive and impractical to run constantly?
Any advice would be very much appreciated, maybe someone else has come across this use case already? Thank you very much!