I have written a program to solve a special puzzle, but now I’m kind of stuck at the following problem:
I have about 3200 points/nodes/dots. Each of these points is connected to a few other points (usually 2-5, theoretical limit is 1-26). I have exactly one starting point and about 30 exit points (probably all of the exit points are connected to each other). Many of these 3200 points are probably not connected to neither start nor end point in any way, like a separate net, but all points are connected to at least one other point.
I need to find the shortest number of hops to go from entry to exit. There is no distance between the points (unlike the road or train routing problem), just the number of hops counts. I need to find all solutions with the shortest number of hops, and not just one solution, but all. And potentially also solutions with one more hop etc.
I expect to have a solution with about 30-50 hops to go from start to exit.
I already tried:
1) randomly trying possibilities and just starting over when the count was bigger than a previous solution. I got first solution with 3500 hops, then it got down to about 97 after some minutes, but looking at the solutions I saw problems like unnecessary loops and stuff, so I tried to optimize a bit (like not going back where it came from etc.). More optimizations are possible, but this random thing doesn’t find all best solutions or takes too long.
2) Recursively run through all ways from start (chess-problem-like) and breaking the try when it reached a previous point. This was looping at about a length of 120 nodes, so it tries chains that are (probably) by far too long. If we calculate 4 possibilities and 120 nodes, we’re reaching 1.7E72 possibilities, which is not possible to calculate through. This is called Depth-first search (DFS) as I found out in the meantime. Maybe I should try Breadth-first search by adding some queue?
The connections between the points are actually moves you can make in the game and the points are how the game looks like after you made the move.
What would be the algorithm to use for this problem? I’m using C#.NET, but the language shouldn’t matter.
4
generally this is called path finding
You seemed to have tried 2 depth first searches.
I suggest using a breadth first search (keep all paths of length n and then use those to find all paths of length n+1).
If this exceeds your memory then instead you can use a iterative deepening approach. This is a depth first that you break off at length n and if you don’t find a solution then try again with n' = n+k
and repeat
I recommend a depth-first search like you already tried, but put some additional constraints on it. For example, a path that loops (revisits nodes) is clearly undesirable when looking for the shortest path: simply remove that loops, and you found a path shorter by N where N is the number of nodes in the loop.
With any algorithm it makes sense to set up the preconditions, including the problem to be solved: the steps of the algorithm: and the exit condition (success or failure).
The Problem
Consider a graph consisting of an arbitrary number of nodes. One node is the start node and there are multiple exit nodes. The goal is to find all paths of the same length that traverse from the start node to one of the exit nodes and that are the shortest length.
Each traversal of the graph will necessarily maintain a list of all the nodes traversed from start to the current node. This defines the path.
There will be storage to contain the completed paths which contains zero or more paths representing traversal from the start node to an exit node.
Algorithm
- Current node is the start node.
- Pick one edge not yet traversed: that is not part of the current path; that was not traversed and backtracked; that does not connect to the current node (does not loop).
- If there are no edges to traverse and the current node is the start node, the algorithm terminates.
- If there are no edges to traverse and the current node is not the start node, backtrack one node and repeat this step.
- Traverse the edge to the next node and add it to the path.
- If the current node is an exit node: copy the path to the completed paths and backtrack to the previous node.
- Go to step 2.
Notes
This is a basic depth-first algorithm that will be relentless in finding all of the paths. This is necessary in order to find the shortest one: the last path traversed could be the shortest, we don’t know.
This algorithm:
- Will find all paths.
- Will not traverse the same edge twice in one path. That is, will not loop.
- Does not care about nodes or systems of nodes that are not reachable from the start node. Such nodes will take up memory, but will not matter to the running time of the algorithm.
- Might be able to be improved on. Specifically, it might be possible to run a different algorithm to prune nodes and edges ahead of time which might help reduce the number of nodes in the graph. Perhaps there is a chain of nodes a thousand long that ends in a non-exit node, for example. Sometimes the time and effort to do this helps, sometimes not.