I’m trying to calculate a hash of some columns in Azure Data Factory. I’ve seen advice in other questions to use functions like sha2(256, columns()). This seems safe at first glance, but I’ve found that these two expressions generate the same hash:
sha2(256, "ABC", "DEF")
sha2(256, "AB", "CDEF")
So it seems that the hash functions, when passed multiple parameters, simply concatenate them before generating the hash. The two inputs above are different data, yet they produce the same hash, which isn’t what you’d expect from a hash over a set of columns.
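To illustrate what I mean outside of ADF, here is a minimal Python sketch. It assumes that sha2 joins its arguments with no separator, which is only my guess from the behaviour above, not something I've confirmed. The helper names concat_hash and length_prefixed_hash are hypothetical and exist only for this example; the second one shows, for contrast, the kind of result I'd expect, where each value is prefixed with its length so the column boundaries affect the hash.

```python
import hashlib


def concat_hash(*columns: str) -> str:
    """Hash the columns joined with no separator (the assumed ADF behaviour)."""
    joined = "".join(columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def length_prefixed_hash(*columns: str) -> str:
    """Hash the columns with each value prefixed by its length and a colon,
    so that moving characters between columns changes the hashed input."""
    joined = "".join(f"{len(c)}:{c}" for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Plain concatenation: both inputs become "ABCDEF", so the hashes collide.
print(concat_hash("ABC", "DEF") == concat_hash("AB", "CDEF"))

# Length-prefixed: "3:ABC3:DEF" vs "2:AB4:CDEF", so the hashes differ.
print(length_prefixed_hash("ABC", "DEF") == length_prefixed_hash("AB", "CDEF"))
```

The first comparison is the collision I'm seeing; the second is the behaviour I'd like to get from a dynamically generated list of columns.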
Is there a best practice for generating a different hash from different data like the “ABC”, “DEF” and “AB”, “CDEF” above, especially when it comes to a dynamically-generated list of columns? Or am I missing something?