I am creating a cleansing view for a ledger table where transactions are stored with hierarchical structure in the source system. I want to create an additional colum ‘Group_Id’ identifying all the records which belong to the same transaction set.
My goal is to find the most efficient way to populate the ‘Group_Id’ column like in the following example:
Entry_No | Closed_By_Entry_No | Group_Id |
---|---|---|
1 | 4 | 1 |
2 | 4 | 1 |
3 | 4 | 1 |
4 | 5 | 1 |
5 | 0 | 1 |
6 | 9 | 2 |
7 | 9 | 2 |
8 | 9 | 2 |
9 | 11 | 2 |
10 | 11 | 2 |
11 | 0 | 2 |
12 | 14 | 3 |
13 | 14 | 3 |
14 | 0 | 3 |
15 | 16 | 4 |
16 | 0 | 4 |
17 | 0 | 5 |
18 | 20 | 6 |
19 | 20 | 6 |
20 | 22 | 6 |
21 | 22 | 6 |
22 | 0 | 6 |
Where Closed_By_Entry_No is the Parent_Id and Entry_No is the Child_Id and the Parent_Id of the top-most Child_Id of each set is always 0 (zero).
The challenge for me is that I don’t have one hierachy in my dataset, I have hundreds of thousand of them. Tipically each set is made of two records: 1 invoice + 1 payments (70% of the records in total), but then there are more elaborated situations like in the above example where I can have e.g. 15 invoices + 3 payments + 1 credit note. And I need to identify all of those sets in a way I can treat them as a single subject for further calculations.
The only idea I have now to solve this is feeding a cursor with only
SELECT Entry_No, Closed_By_Entry_No FROM table WHERE Closed_By_Entry_No = 0
scroll this cursor through the resultset sending each candidate record to a stored procedure containing the recursive CTE to read the hierarchy from the original table and populating the Group_Id incrementally from the cursor index itself and storing the result in a temp table. Then continue from that temp table ….
There must be a simpler method that doesn’t use either cursors or temporary tables.
Thanks for any hint.
maclura is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.