I’m working on a project in C where I need to find all the solutions to the n-queens problem. It is a really interesting and challenging problem, which basically consists of trying to find all the possible placements in a nxn chessboard of an exact n number of queens, without threatening in between each other. It is known that there is an exact 234.907.967.154.122.528 number of solutions for n=27, the largest number discovered yet.
There are a lot of different algorithms to solve this. One example of the most popular and efficient one is known as backtracking.
However, although it is not nearly as efficient as the backtracking, I want to find all the solutions for a determinated n using the following algorithm:
Firstly, I generate a totally random vector of dim n. If that vector is encoded to not be one of the solutions to the problem, I assign it a hash. Then, we run into a for-loop which basically compares all the solutions to this particular hash. We proceed with adding all the solutions to a counter and finally we print that counter with the number of solutions.
Note: the solutions encoding into vectors can be expressed as the following image, for example.
In a first instance, I designed a randomness algorithm that it just took too long to be efficient. However, I came across an idea: assigning a hash to the vectors and then comparing them with their “hashes”. The program just went up in terms of efficiency: what were minutes then turned into mere milliseconds to calculate.
The code of this algorithm is the following:
Firstly, we have the all_solutions
function, which sums up the total number of solutions by the previous description I detailed. The is_solution
function just checks what it’s name indicates, throughout a mathematical formula I figured out earlier to start coding.
void all_solutions(unsigned int n) {
unsigned long solutions[MAX_SOLUTIONS];
unsigned int num_solutions = 0;
unsigned int v[n];
FILE *fptr = fopen("sortida_aleatori3_nxn.txt", "w");
if (fptr == NULL) {
printf("There was a problem when opening the filen");
exit(1);
}
while (num_solutions < MAX_SOLUTIONS) {
random_vector(n, v);
if (!is_solution(n, v)) {
unsigned long hash = hash_vector(n, v);
bool exists = false;
for (unsigned int i = 0; i < num_solutions; i++) {
if (solutions[i] == hash) {
exists = true;
break;
}
}
if (!exists) {
solutions[num_solutions++] = hash;
for (unsigned int i = 0; i < n; i++) {
fprintf(fptr, "%u ", v[i]);
}
fprintf(fptr, "n");
}
}
}
fclose(fptr);
printf("Total solutions found:: %un", num_solutions);
}
Secondly, we have the random_vector
function, which is nothing but the following:
void random_vector(unsigned int n, unsigned int v[n]) {
for (int i = 0; i < n; i++) {
v[i] = (rand() % n) + 1;
}
}
Then, we have the most important, the hash_vector
function:
unsigned long hash_vector(unsigned int n, unsigned int v[n]) {
unsigned long hash = 0;
for (int i = 0; i < n; i++) {
hash = hash * 31 + v[i];
}
return hash;
}
Finally, all those work together in main
:
int main() {
clock_t start_time = clock();
srand(time(0));
unsigned int n = 12;
all_solutions(n);
clock_t end_time = clock();
double time_taken = ((double)(end_time - start_time)) / CLOCKS_PER_SEC;
printf("Time of execution: %f secondsn", time_taken);
return 0;
}
Execution times for different values of n
So, I’ve tested that for n=11, which we know has an exact value of 2680 solutions,
the execution time is 0.019889 seconds
For n=12, which we know has an exact value of 14200 solutions,
the execution time is 0.246513 seconds
For n=13, with 73712 solutions, the execution time is 5.756419 seconds
For n=14, with 365596 solutions, it is 141.701347 seconds
As you may observe, it is a really big function (determined by n to the power of n), as the number of solutions is expressed by:
number of solutions <= V * (R^n)_n = n^n
Therefore, every time the n gets a little bit larger (+1), the execution time increases significantly.
Now, my question is, without using further complex but more efficient algorithms such as backtracking, can we speed up this process even more? The previous algorithm I used was very slow, so there is a big improvement. However, can I turn it more efficient than it is already?
I thought of different ways, but the one that got my attention is by changing this little random number here, in the hash_vector
function:
hash = hash * 31 + v[i];
I thought 31 was a good choice because of its properties. However, changing this number into a even better one, will it improve the code’s efficiency, or is it just a worthless idea?
If the answer is yes, what conditions of properties we may consider to choose a better number to hash the vector? Is there even a “perfect number” that would be the best choice in here?
Obviously, this algorithm, as I presented before, is not the best efficient as, for instance, the backtracking one.
However, I’m just trying to “play” with vectors, hashes and the properties of some numbers, like 31.
Although my idea was to experiment with that particular number, I didn’t think of any more ideas to make my algorithm more efficient. So, I am open to hear all the possible ideas that you may have, and I will be updating this post from time to time to see if the time’s efficiency has gotten any better (or worse) depending on which ideas.