I would like to remove duplicates when aggregating results with STRING_AGG in SQL Server. I have tried a few approaches from Stack Overflow, but none of them seem to work.
This is the structure of my table:
https://sqlfiddle.com/sql-server/online-compiler?id=65d04783-2b98-4e3d-acb1-b045d118f18a
SubscriptionId | Number | BusinessOwner | ResourceGroupName | ITApplicationOwner | Parent |
---|---|---|---|---|---|
a1b2c3d4-1234-5678-9abc-123456789abc | AP000000001 | John Doe | rg-xyz-abc-123-usw-01 | Jane Smith | Global Tech Corp (TECH) |
a1b2c3d4-1234-5678-9abc-123456789abc | AP000000002 | John Doe | rg-xyz-abc-123-usw-01 | Alan Brown | Global Tech Corp (TECH) |
a1b2c3d4-1234-5678-9abc-123456789abc | AP000000003 | Maria Johnson | rg-xyz-abc-123-usw-01 | David Clark | Global Tech Corp (TECH) |
I want the output to look like this:
SubscriptionId | Number | BusinessOwner | ResourceGroupName | ITApplicationOwner | Parent |
---|---|---|---|---|---|
a1b2c3d4-1234-5678-9abc-123456789abc | AP000000001,AP000000002,AP000000003 | John Doe, Maria Johnson | rg-xyz-abc-123-usw-01 | Jane Smith, Alan Brown, David Clark | Global Tech Corp (TECH) |
This is my current query. It combines the values in the required format but does not remove duplicates:

    WITH CombinedData AS (
        SELECT
            tmo.SubscriptionId,
            tmo.ResourceGroupName,
            -- Removing duplicates in the Numbers field
            (SELECT STRING_AGG(tmo2.Number, ',')
             FROM [TRFM_multipleappid-owners] tmo2
             WHERE tmo2.SubscriptionId = tmo.SubscriptionId
               AND tmo2.ResourceGroupName = tmo.ResourceGroupName) AS Numbers,
            -- Removing duplicates in the BusinessOwners field
            (SELECT STRING_AGG(tmo2.BusinessOwner, ',')
             FROM [TRFM_multipleappid-owners] tmo2
             WHERE tmo2.SubscriptionId = tmo.SubscriptionId
               AND tmo2.ResourceGroupName = tmo.ResourceGroupName) AS BusinessOwners,
            -- Removing duplicates in the ITApplicationOwner field
            (SELECT STRING_AGG(tmo2.ITApplicationOwner, ',')
             FROM [TRFM_multipleappid-owners] tmo2
             WHERE tmo2.SubscriptionId = tmo.SubscriptionId
               AND tmo2.ResourceGroupName = tmo.ResourceGroupName) AS ITApplicationOwners,
            -- Removing duplicates in the Parents field
            (SELECT STRING_AGG(tmo2.parent, ',')
             FROM [TRFM_multipleappid-owners] tmo2
             WHERE tmo2.SubscriptionId = tmo.SubscriptionId
               AND tmo2.ResourceGroupName = tmo.ResourceGroupName) AS Parents
        FROM
            [TRFM_multipleappid-owners] tmo
        GROUP BY
            tmo.SubscriptionId,
            tmo.ResourceGroupName
    )
    SELECT
        CombinedData.SubscriptionId,
        CombinedData.Numbers,
        CombinedData.BusinessOwners,
        CombinedData.ResourceGroupName,
        CombinedData.ITApplicationOwners,
        CombinedData.Parents
    FROM
        CombinedData;

How do I remove the duplicates within each aggregated field?
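
For context, a commonly suggested workaround is sketched below. It rests on the assumption that STRING_AGG in SQL Server does not accept a DISTINCT keyword, so each column is de-duplicated with SELECT DISTINCT in a derived table before it is aggregated. The aliases g, t, d, n, b, i and p are illustrative names that do not appear in the original query, and the sketch has not been run against the real table.

```sql
-- Sketch only: de-duplicate each column with SELECT DISTINCT, then aggregate.
-- Assumes SQL Server 2017 or later (STRING_AGG) and the table from the question.
SELECT
    g.SubscriptionId,
    n.Numbers,
    b.BusinessOwners,
    g.ResourceGroupName,
    i.ITApplicationOwners,
    p.Parents
FROM (
    -- One row per (SubscriptionId, ResourceGroupName) group.
    SELECT DISTINCT SubscriptionId, ResourceGroupName
    FROM [TRFM_multipleappid-owners]
) AS g
CROSS APPLY (
    SELECT STRING_AGG(d.Number, ',') AS Numbers
    FROM (
        SELECT DISTINCT t.Number
        FROM [TRFM_multipleappid-owners] AS t
        WHERE t.SubscriptionId = g.SubscriptionId
          AND t.ResourceGroupName = g.ResourceGroupName
    ) AS d
) AS n
CROSS APPLY (
    SELECT STRING_AGG(d.BusinessOwner, ',') AS BusinessOwners
    FROM (
        SELECT DISTINCT t.BusinessOwner
        FROM [TRFM_multipleappid-owners] AS t
        WHERE t.SubscriptionId = g.SubscriptionId
          AND t.ResourceGroupName = g.ResourceGroupName
    ) AS d
) AS b
CROSS APPLY (
    SELECT STRING_AGG(d.ITApplicationOwner, ',') AS ITApplicationOwners
    FROM (
        SELECT DISTINCT t.ITApplicationOwner
        FROM [TRFM_multipleappid-owners] AS t
        WHERE t.SubscriptionId = g.SubscriptionId
          AND t.ResourceGroupName = g.ResourceGroupName
    ) AS d
) AS i
CROSS APPLY (
    SELECT STRING_AGG(d.Parent, ',') AS Parents
    FROM (
        SELECT DISTINCT t.Parent
        FROM [TRFM_multipleappid-owners] AS t
        WHERE t.SubscriptionId = g.SubscriptionId
          AND t.ResourceGroupName = g.ResourceGroupName
    ) AS d
) AS p;
```

The order of values inside each list is not guaranteed in this form. If a stable order matters, STRING_AGG accepts a WITHIN GROUP (ORDER BY ...) clause, for example `STRING_AGG(d.Number, ',') WITHIN GROUP (ORDER BY d.Number)`.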