I want to implement the A* algorithm. I have a network with Nodes and Edges, both are classes.
Now, I am unsure how to approach the connectivity information. Should a node know which edges depart from it? Should an edge know which two nodes it connects?
I thought the answer to both was yes, but that creates circular dependencies: copying a node copies all of its edges, which in turn copy all the nodes they connect, and so on.
Maybe it is better to store all connectivity information in the Network class? But then I have to create it explicitly and update it manually whenever I add or delete nodes/edges.
What is a good way to approach this?
Graphs
The dependencies of a graph are:
graph <- {vertices, edges}
edge <- {u, v} // u and v are vertices
vertex <- {Label}
The Label of a vertex may be implicit in the implementation’s data structure, e.g. its index within an array. Vertices depend on labels independently of edges because a single vertex is still a graph [or a tree, or a network] in its own right.
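As a minimal sketch of that dependency structure (the class and method names here are illustrative assumptions, not taken from the question), the graph can own both the vertex labels and the edges, with vertices being nothing more than integer indices:

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical graph: vertices are the labels 0..vertex_count-1."""
    vertex_count: int = 0
    edges: set = field(default_factory=set)  # set of (u, v) label pairs

    def add_vertex(self) -> int:
        # The label is implicit: it is just the next free index.
        label = self.vertex_count
        self.vertex_count += 1
        return label

    def add_edge(self, u: int, v: int) -> None:
        for label in (u, v):
            if not 0 <= label < self.vertex_count:
                raise ValueError(f"unknown vertex {label}")
        self.edges.add((u, v))

    def successors(self, u: int) -> list:
        # No per-vertex storage: scan the edge set (O(|E|) per query).
        return sorted(v for (a, v) in self.edges if a == u)


g = Graph()
a, b, c = g.add_vertex(), g.add_vertex(), g.add_vertex()
g.add_edge(a, b)
g.add_edge(a, c)
print(g.successors(a))
```

Nothing here refers back from a vertex to its edges, so there is nothing circular to copy. The price is that every neighbor query scans the whole edge set.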
Business Logic
In an implementation, the vertex label might be used to retrieve a record. What is stored in the record is a matter of business logic, not a property of graphs. Therefore the choice of storing edges at the vertex is subject to engineering analysis. But all graph operations can be implemented without doing so.
The more detailed the information stored at the vertices, the less general the implementation becomes. Storing edges at the node is essentially caching and comes with all the issues normally associated with any other cache, e.g. routers and switches spend a lot of their energy implementing business logic to cope with storing edges at nodes.
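To make the caching point concrete, here is a rough sketch (the name `CachedNetwork` and its methods are assumptions for illustration) in which the edge set is the source of truth and the per-node successor sets are a cache that every mutation has to keep in sync:

```python
class CachedNetwork:
    """Hypothetical network: edge set plus a per-node successor cache."""

    def __init__(self):
        self.edges = set()  # source of truth: (u, v) pairs
        self._out = {}      # cache: node -> set of successor nodes

    def add_node(self, u):
        self._out.setdefault(u, set())

    def add_edge(self, u, v):
        self.add_node(u)
        self.add_node(v)
        self.edges.add((u, v))
        self._out[u].add(v)  # cache update

    def remove_node(self, u):
        # Drop every edge touching u, then invalidate the cache entries.
        self.edges = {(a, b) for (a, b) in self.edges if u not in (a, b)}
        self._out.pop(u, None)
        for successors in self._out.values():
            successors.discard(u)

    def successors(self, u):
        return self._out[u]  # fast lookup, valid only if the cache is in sync


net = CachedNetwork()
net.add_edge("a", "b")
net.add_edge("b", "c")
net.remove_node("b")
print(net.successors("a"))  # the stale entry for "b" has been removed
```

Forgetting either half of `remove_node` leaves the cache and the edge set disagreeing, which is exactly the kind of bookkeeping the paragraph above warns about.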
Tradeoffs
The more mathematical the implementation, the easier it is to reason about. The more business-like, the higher the theoretical ceiling on performance. However, reaching that ceiling depends on getting the business logic right, and that’s harder than getting the mathematics right.
Unless your edges have attributes (which you didn’t state) the simplest way is to not add edge A → B if A → B already exists. This is how decentralized flooding mechanisms (e.g. USENET) keep from forming cycles.
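A tiny sketch of that check, assuming edges without attributes are stored as plain `(a, b)` pairs in a set (the helper name `add_edge_once` is made up for illustration):

```python
def add_edge_once(edges: set, a, b) -> bool:
    """Add the edge a -> b unless it already exists; report whether it was added."""
    if (a, b) in edges:
        return False
    edges.add((a, b))
    return True


edges = set()
print(add_edge_once(edges, "A", "B"))  # first insertion
print(add_edge_once(edges, "A", "B"))  # duplicate is refused
```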
As for which objects should know what they are connected to, you’ve got a space/time trade-off: you can have nodes know edges, or edges know nodes, or both. To do useful stuff with graphs, both is probably best; otherwise you’ll spend a lot of time searching for elements.
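One possible way to have both directions without the circular-copy problem from the question is to let nodes and edges refer to each other by id, with the network owning the actual objects. The sketch below is only an illustration under assumptions: the names `Node`, `Edge`, `Network` and `a_star` are invented here, nodes carry `x`/`y` coordinates, the heuristic is straight-line distance, and edge ids are assigned by counting, which only works while edges are never deleted.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Edge:
    src: int   # node id, not a Node object
    dst: int
    cost: float


@dataclass
class Node:
    x: float
    y: float
    out_edges: list = field(default_factory=list)  # edge ids, not Edge objects


class Network:
    def __init__(self):
        self.nodes = {}  # node id -> Node
        self.edges = {}  # edge id -> Edge

    def add_node(self, node_id, x, y):
        self.nodes[node_id] = Node(x, y)

    def add_edge(self, src, dst, cost):
        edge_id = len(self.edges)  # assumption: edges are never removed
        self.edges[edge_id] = Edge(src, dst, cost)
        self.nodes[src].out_edges.append(edge_id)
        return edge_id


def a_star(net, start, goal):
    def h(n):  # straight-line distance to the goal
        a, b = net.nodes[n], net.nodes[goal]
        return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5

    best = {start: 0.0}  # cheapest known cost from start to each node
    parent = {}
    frontier = [(h(start), start)]
    while frontier:
        _, current = heapq.heappop(frontier)
        if current == goal:
            path = [current]
            while current in parent:
                current = parent[current]
                path.append(current)
            return path[::-1]
        for edge_id in net.nodes[current].out_edges:
            edge = net.edges[edge_id]
            g = best[current] + edge.cost
            if g < best.get(edge.dst, float("inf")):
                best[edge.dst] = g
                parent[edge.dst] = current
                heapq.heappush(frontier, (g + h(edge.dst), edge.dst))
    return None  # goal not reachable


net = Network()
for node_id, (x, y) in enumerate([(0, 0), (1, 0), (2, 0), (1, 5)]):
    net.add_node(node_id, x, y)
net.add_edge(0, 1, 1.0)
net.add_edge(1, 2, 1.0)
net.add_edge(0, 3, 6.0)
net.add_edge(3, 2, 6.0)
print(a_star(net, 0, 2))
```

Because a `Node` holds only edge ids, copying a node copies a list of integers rather than dragging the rest of the network along. The search still gets constant-time access to outgoing edges through `out_edges`.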