I have the following table:
UserID1 | Amount1 | UserID2 | Amount2 | ….. | UserIDN | AmountN |
---|
Each row in the table contains unique user IDs with the corresponding amounts. However, the distribution of these unique user IDs across columns is different for different rows (i.e. the same user ID can appear in different columns for different rows), like this:
UserID1 | Amount1 | UserID2 | Amount2 | ….. | UserIDN | AmountN |
---|---|---|---|---|---|---|
1 | 10 | 2 | 15 | ….. | 125 | 30 |
2 | 26 | 5 | 17 | ….. | 3 | 26 |
125 | 17 | 3 | 22 | ….. | 1 | 20 |
…..
I need to sum the Amount values for each identical UserID from all rows and dump these results into the following table:
UserID | Amount |
---|
in which each row will contain a unique UserID from the original table and the sum of this UserID’s Amounts from all the rows in the original table. For the example above the resulting table will look like:
UserID | Amount |
---|---|
1 | 30 |
2 | 41 |
3 | 48 |
125 | 47 |
5 | 17 |
What would be the best way to do it?
Thanks in advance.
11
Please try the following solution based on SQL Server’s powerful XML and XQuery functionality.
It is generic regardless how many column pairs are across in the table. No need in any dynamic SQL.
The entire magic is based on the mod
operator.
Notable points:
- 1st
CROSS APPLY
is converting all columns into XML data type. - 2nd
CROSS APPLY
is re-shaping XML into a proper format based on XML attributes. - 3rd
CROSS APPLY
is shredding XML back into relational/rectangular data. - Post CTE
SELECT
is grouping and summing data.
To feel and see it, here is the first row x value has the following XML:
<root>
<UserID1>1</UserID1>
<Amount1>10</Amount1>
<UserID2>2</UserID2>
<Amount2>15</Amount2>
<UserID3>125</UserID3>
<Amount3>30</Amount3>
</root>
and the y has the following value:
<root>
<r id="1" amount="10" />
<r id="2" amount="15" />
<r id="125" amount="30" />
</root>
The same for each row.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (UserID1 INT, Amount1 INT, UserID2 INT, Amount2 INT, UserID3 INT, Amount3 INT);
INSERT INTO @tbl (UserID1, Amount1, UserID2, Amount2, UserID3, Amount3) VALUES
(1 , 10, 2, 15, 125, 30),
(2 , 26, 5, 17, 3 , 26),
(125, 17, 3, 22, 1 , 20);
-- DDL and sample data population, end
WITH rs AS
(
SELECT z.value('@id', 'INT') AS ID
, z.value('@amount', 'INT') AS Amount
FROM @tbl AS t
CROSS APPLY (SELECT t.* FOR XML PATH(''), TYPE, ROOT('root')) AS t1(x)
CROSS APPLY (SELECT x.query('<root>
{
for $x in /root/*[position() mod 2 = 1]
let $pos := count(/root/*[. << $x[1]]) + 1
return <r id="{$x/text()}" amount="{/root/*[$pos + 1]/text()}"></r>
}
</root>')) AS t2(y)
CROSS APPLY y.nodes('/root/r') AS t3(z)
)
SELECT ID, SUM(Amount) AS Amount
FROM rs
GROUP BY ID
ORDER BY ID;
Output
ID | Amount |
---|---|
1 | 30 |
2 | 41 |
3 | 48 |
5 | 17 |
125 | 47 |
1
If you are doing this as a one-off, you can use dynamic SQL:
declare @i integer=2, @sql as nvarchar(max)='select userId1 as UserId, Amount1 as Amount from YrTbl';
while @i <= 50 begin
set @sql = @sql + ' union all' + char(10) +
N'select userId'
+ cast(@i as nvarchar(10)) + N', '
+ 'Amount'
+ cast(@i as nvarchar(10)) + N' from YrTbl'
set @i=@i+1
end;
set @sql=N'select UserId, sum(Amount) as Amount into #YrNewTable from (' + @sql + ') S1 group by UserId'
exec sp_executesql @sql
--print @sql
For production work careful consideration must be paid to security/performance aspects of dynamic SQL.
Alternative to looping:
declare @sql varchar(max);
set @sql=(SELECT string_agg('select ' + Column_name + ' as UserId, Amount' + substring(column_Name, 7,10) + ' as Amount from YrTbl union all', char(10))
FROM tempdb.INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_NAME LIKE '#YrTbl%'
and Column_Name like 'UserId%')
set @sql='select UserId, sum(Amount) as Amount) from ('
+ lefT(@sql,len(@sql)-9)
+ ') S1 group by UserId'
exec sp_executesql@sql
4
One of the options to do it is using UNPIVOT to get your UserID-Amount pairs into rows and then to do aggregation:
WITH -- S a m p l e D a t a :
tbl (UserID1, Amount1, UserID2, Amount2, UserID3, Amount3) AS
( Select 1, 10, 2, 15, 125, 30 Union All
Select 2, 26, 5, 17, 3, 26 Union All
Select 125, 17, 3, 22, 1, 20
),
… create cte (pairs) to transpose columns to rows using UNPIVOT
pairs as
( Select Case When UserID Is Null
Then LAG(UserID) Over(Partition By PAIR Order By PAIR)
Else UserID
End as UserID,
Amount
From ( Select COLUMN_PAIR, Right(COLUMN_PAIR, 1) as PAIR,
Case When Left(COLUMN_PAIR, 1) = 'U' Then Amount End as UserID,
Case When Left(COLUMN_PAIR, 1) = 'A' Then Amount Else 0 End as Amount
From ( Select *
From ( Select * From tbl ) t
UNPIVOT ( Amount
FOR COLUMN_PAIR IN( UserID1, Amount1,
UserID2, Amount2,
UserID3, Amount3 )
) u
) u
) u
)
… aggregate (Sum) the amounts and group by UserID
-- M a i n S Q L :
Select UserID, Sum(Amount) as Amount
From pairs
Group By UserID
Order By UserID
/*
UserID Amount
------ ------
1 30
2 41
3 48
5 17
125 47 */
See the fiddle here.
2