I’m programming a genetic algorithm in C++, and after looking at all kinds of ways of implementing the GA operators (selection, crossover, mutation), I ran into a question.
Let’s say I have an initial population of 500. My selection consists of taking the top 20% of those 500 (based on best fitness), so I get 100 individuals to mate. Each crossover gives me 2 children that together have a 50% chance of surviving. So far so good. I run the mutation step and everything is fine. But when I start building the next generation, I see that I have a huge number of children (4950, if you want to know). The problem is that every time I run the GA, if I send all the children to the next generation, the number of individuals per generation grows exponentially.
So there must be a way of choosing the children to fill a new generation without growing beyond the size of the initial population.
What I’m asking is: is there a way to choose which children fill the new generation, OR should I somehow choose (and maybe reduce) the parents that mate so I don’t end up with so many children?
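To make the numbers concrete, here is a minimal sketch of the flow described above (the names and layout are mine, not the asker's actual code): take the top 20% of 500 by fitness and cross every pair of the 100 parents, which gives C(100, 2) = 4950 pairings and is presumably where the 4950 children come from.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

struct Individual {
    std::vector<int> genes;
    double fitness = 0.0;
};

int main() {
    std::vector<Individual> population(500);   // initial population, fitness evaluated elsewhere

    // Keep the best 20% as parents: 100 individuals.
    std::sort(population.begin(), population.end(),
              [](const Individual& a, const Individual& b) {
                  return a.fitness > b.fitness;
              });
    population.resize(population.size() / 5);

    // Mating every parent with every other parent gives C(100, 2) = 4950
    // pairings per generation, which is why the child count explodes.
    std::size_t pairs = 0;
    for (std::size_t i = 0; i < population.size(); ++i)
        for (std::size_t j = i + 1; j < population.size(); ++j)
            ++pairs;   // crossover(population[i], population[j]) would happen here
    std::cout << pairs << " pairings per generation\n";
}
```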
Answer (score: 6)
After you have generated your initial population (the pool should be quite large) and applied your fitness function to it, you select your parents for the next generation.
Once you have your parents, you discard the other individuals and replace them with the new generation. This replacement keeps your population size under control; after all, if individual I did not meet the fitness criteria to contribute to the next generation, why keep it?
Note, however, that with this setup there is a high chance your algorithm will converge to a local maximum/minimum very quickly, because you only keep the top 20% of your population for mating. That is usually a bad idea, since it gets your GA stuck in a local maximum/minimum. To fix this, you should also include some of the worse solutions, for instance the top 20% plus the worst 5-10%.
EDIT: Alternatively, you could do something akin to what @Jeff Langemeier proposes: instead of selecting the worst 10% of the population, you randomly select a given number of individuals from the non-best part (in this case, the remaining 80%).
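Here is a sketch of the replacement scheme this answer describes, under my own assumptions about the data layout (the `crossover` and `mutate` helpers are hypothetical and assumed to be defined elsewhere in your GA; `std::sample` needs C++17): keep the top 20% plus a random 10% drawn from the remaining 80% as parents, then breed exactly enough children to bring the population back to its original size.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

struct Individual {
    std::vector<int> genes;
    double fitness = 0.0;
};

// Hypothetical operators -- assumed to be defined elsewhere in the GA.
Individual crossover(const Individual& a, const Individual& b);
void mutate(Individual& child);

std::vector<Individual> nextGeneration(std::vector<Individual> population,
                                       std::mt19937& rng) {
    const std::size_t popSize = population.size();   // e.g. 500

    // Sort best-first by fitness.
    std::sort(population.begin(), population.end(),
              [](const Individual& a, const Individual& b) {
                  return a.fitness > b.fitness;
              });

    // Parents: the top 20% plus a random 10% taken from the remaining 80%,
    // so some "worse" genetic material stays in the pool.
    std::vector<Individual> parents(population.begin(),
                                    population.begin() + popSize / 5);
    std::sample(population.begin() + popSize / 5, population.end(),
                std::back_inserter(parents), popSize / 10, rng);

    // Everyone who is not a parent is discarded; children fill the gap so
    // the next generation has exactly popSize individuals again.
    std::vector<Individual> next = parents;
    std::uniform_int_distribution<std::size_t> pick(0, parents.size() - 1);
    while (next.size() < popSize) {
        Individual child = crossover(parents[pick(rng)], parents[pick(rng)]);
        mutate(child);
        next.push_back(child);
    }
    return next;
}
```

Whether you copy the parents into the next generation (elitism, as above) or replace everyone with children is a design choice; either way the population size stays fixed.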
Answer (score: 5)
Choosing the children that are fit to mate in the next generation uses the same fitness calculation that made their parents fit. Also, at the end of the current generation you should not have more children than the initial population. Remember, this is not a free-for-all but survival of the fittest.
You are picking a highly selected group to mate from, essentially discarding 75-80% or more of your initial population (search space), or however much you need to discard to ensure that only the fittest mate.
A genetic algorithm should be run until you have exhausted the search space or, in other words, until there are no more mating pairs and, hopefully, that last offspring or very small group of offspring yields the answer you want.
This will require you to tune the fitness, crossover, and mutation factors. I recall writing a GA to figure out how the arithmetic operators (*, /, +, -) and the numeric values 0-9 could be combined to form a randomly generated target value such as 45. I represented my chromosomes as x-bit binary values that encoded every operator and the digits 0-9, and they were heavily randomized through a heuristic to ensure as much variation as possible.
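The exact encoding isn't given above, but a sketch of one plausible scheme may help: each gene is a small integer that decodes to a digit 0-9 or one of the four operators, the chromosome is evaluated left to right, and fitness measures how close the result lands to the target value (45 in the example). All names here are my own illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>
#include <vector>

// Decode a chromosome: values 0-9 are digits, 10-13 map to + - * /;
// anything else is ignored. Evaluation is strictly left to right.
double decode(const std::vector<std::uint8_t>& genes) {
    double value = 0.0;
    char pendingOp = '+';
    for (std::uint8_t g : genes) {
        if (g <= 9) {                        // digit: apply the pending operator
            switch (pendingOp) {
                case '+': value += g; break;
                case '-': value -= g; break;
                case '*': value *= g; break;
                case '/': if (g != 0) value /= g; break;
            }
        } else if (g <= 13) {                // operator gene
            pendingOp = "+-*/"[g - 10];
        }
    }
    return value;
}

// Higher is better; an exact hit on the target gives the maximum score of 1.
double fitness(const std::vector<std::uint8_t>& genes, double target = 45.0) {
    return 1.0 / (1.0 + std::fabs(target - decode(genes)));
}

int main() {
    // 6 * 8 - 3 = 45 under the left-to-right evaluation above.
    std::vector<std::uint8_t> genes = {6, 12, 8, 11, 3};
    std::cout << decode(genes) << " fitness=" << fitness(genes) << "\n";
}
```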
I had to tweak selection, crossover, and mutation to solve the problem, but it could be solved. If you see populations growing out of control, something is not right in your algorithm. I recall that, from an initial population of 100,000, I lost between 50k and 75k individuals that were not fit to mate.
Play with it some more; you understand how it should work and I am sure you will get it.
Answer (score: 4)
Before I answer your question I have to ask you something: when you run your GA, do you actually see a gain in your solution? In other words, is each new generation actually better than the previous one?
If you do, then I am genuinely surprised, because:
- As Mushy has quite correctly pointed out, your GA is just a sugarcoating of simulated annealing (or Monte Carlo).
- As entropy increases, because you just mix and match, the search space to be explored will keep growing, and as we well know, after a few generations your solutions will degenerate. The only two things I can see keeping that from happening are (a) a set of rules on any of the operations you have (crossover, selection, mutation, or others if you are creative enough), or (b) divine intervention: you stick your human hands in and intervene at some or all of the iterations, either through some added heuristic in the fitness function or by scheduling "at iteration 1 do this, at iteration 2 do that, ..." (a small sketch of the latter follows below).
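One concrete way to intervene at the iteration level, as item (b) suggests, is an annealing-style schedule. This is my own illustration, not something prescribed in the answer: start with a high mutation rate for exploration and shrink it each generation so later populations don't degenerate.

```cpp
#include <cmath>

// Exponentially decay the mutation rate from `start` to `end` over the run,
// so early generations explore and later generations mostly refine.
double mutationRate(int generation, int maxGenerations,
                    double start = 0.30, double end = 0.01) {
    double t = static_cast<double>(generation) / maxGenerations;  // 0 at the start, 1 at the end
    return start * std::pow(end / start, t);
}
```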
Disclaimer: I am no expert in this area, but even people like me know you have to put a heuristic in there somewhere.