I have a database with 2 tables NODE and EDGE, where NODE consists of two columns ID (int) and NAME (varchar)
and EDGE consists of two columns PARENTELEMENTID (int) and CHILDELEMENTID(int).
With the help of these two tables i’m now able to emulate a graph structure inside my relational database. For Example:
CREATE TABLE NODE(ID int, NAME varchar(1));
CREATE TABLE EDGE(PARENTELEMENTID int, CHILDELEMENTID int);
INSERT INTO NODE (Id, Name) VALUES(1, 'A');
INSERT INTO NODE (Id, Name) VALUES(2, 'B');
INSERT INTO NODE (Id, Name) VALUES(3, 'C');
INSERT INTO NODE (Id, Name) VALUES(4, 'D');
INSERT INTO NODE (Id, Name) VALUES(5, 'E');
INSERT INTO NODE (Id, Name) VALUES(6, 'F');
INSERT INTO NODE (Id, Name) VALUES(7, 'G');
INSERT INTO NODE (Id, Name) VALUES(8, 'H');
INSERT INTO NODE (Id, Name) VALUES(9, 'I');
INSERT INTO NODE (Id, Name) VALUES(10, 'J');
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (1, 4);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (1, 5);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (2, 4);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (2, 6);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (3, 7);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (4, 7);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (5, 8);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (6, 9);
INSERT INTO EDGE (ParentElementId, ChildElementId) VALUES (10, 9);
That result in the following Graph:
What i need is a statement to retrieve all nodes in this graph, starting at any node.
My first attempt was the following statement:
WITH ElementsToGetParentsFrom as (
select n.Id
from NODE n
where n.Id in (1)
),
ParentHierachies AS (
SELECT p.Id,
0 as traceLevel,
NEWID() as TraceId
from ElementsToGetParentsFrom p
UNION ALL
SELECT
e.ParentElementId as id,
ph.traceLevel + 1,
ph.TraceId as TraceId
FROM EDGE e
JOIN ParentHierachies ph ON e.ChildElementId = ph.id
),
GetRootIds as (
select ph.Id,
dense_rank() over (partition by TraceId order by traceLevel desc) as ranking
from ParentHierachies ph
),
AllElementsInGraph as (
select gri.Id as Id
from GetRootIds gri
where gri.ranking = 1
union all
select e.ChildElementId as Id
from EDGE e
join AllElementsInGraph aeit on aeig.Id = e.ParentElementId
)
SELECT N.*
FROM AllElementsInGraph aeig
INNER JOIN NODE N on aeig.Id = N.Id
I’m using the first cte ‘ElementsToGetParentsFrom’ as an entrance for my statement to decide of which nodes i want to retrieve the graphs.
The second Cte ‘ParentHierachies’ traverses the graph(s) upwards from the given node(s), giving them a level (used for next cte) and a trace-id, to be able to differ between the different root elements.
Now the third cte ‘GetRootIds’ uses the traceid and tracelevel to reduce the result to only root elements of the graphs by partitioning all id’s by their traceid and ranking them by their level in the graph.
In the last cte ‘AllElementsInGraph’ I’m traversing down from the RootElements to get all Nodes in the graph.
The statement works only if my graph is more like a tree (i think there is my problem, since while i’m writing this, i’m realising i’m mixing up terms for trees and graphs).
So for my example graph i’m getting the following results for this input:
A -> A,D,E,G,H
B -> B,E,F,H,I
C -> C,G
D -> A,D,E,G,H
E -> A,B,D,E,F,G,H,I
F -> B,E,F,H,I
G -> A,C,D,E,G,H
H -> A,B,D,E,F,G,H,I
I -> B,E,F,H,I,J
J -> I,J
What i need is a statement (if it is even possible). To retrieve all nodes, given any node(s).
A -> A,B,C,D,E,F,G,H,I,J
B -> A,B,C,D,E,F,G,H,I,J
C -> A,B,C,D,E,F,G,H,I,J
D -> A,B,C,D,E,F,G,H,I,J
E -> A,B,C,D,E,F,G,H,I,J
F -> A,B,C,D,E,F,G,H,I,J
G -> A,B,C,D,E,F,G,H,I,J
H -> A,B,C,D,E,F,G,H,I,J
I -> A,B,C,D,E,F,G,H,I,J
J -> A,B,C,D,E,F,G,H,I,J