I’m a complete beginner when using SQL. I have a scenario where I need to delete the duplicates if it does not meet a condition below.
IF a duplicate is found, check if the subscriber key starts with 003 or 001 or 00Q;
if true, then keep all duplicates;
if any of the subscriberkey does not meet the condition, then delete all duplicates.
Example table:
| emailAddress | subscriberKey|
| ------------ | -------------|
| [email protected] | 0035656 |
| [email protected] | 0015623 |
| [email protected] | sdfsdfd |
| [email protected] | 0018956 |
| [email protected] | 00Q4522 |
result should be:
| emailAddress | subscriberKey |
| -------- | -------- |
| [email protected] | 0018956 |
| [email protected] | 00Q4522 |
I’m having a hard time visualizing how to approach this. I thought about grouping them first to check if there’s a duplicate then I’m stumped on how to check for the subscriber key.
DELETE EmailAddress, SubscriberID, Status, SubscriberKey
FROM [getting duplicates] t1
WHERE EXISTS(
SELECT 1
FROM [getting duplicates] As t2
WHERE t1.EmailAddress = t2.EmailAddress
AND (t2.SubscriberKey NOT Like '003%' OR t2.SubscriberKey NOT Like '001%' OR t2.SubscriberKey NOT Like '00Q%' ) OR t2.Status = 'unsubscribed'
HAVING Count(*) > 1
)
Any thoughts or suggestions that can point me to the right direction will be very helpful.
Thank you very much.
Sigpanda is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.