# How does one verify the results of simulation from genetic algorithm

5 views (last 30 days)
Kah Wei on 5 Aug 2014
Commented: Geoff Hayes on 9 Aug 2014
Recently I am using genetic algorithm to optimize a 2 by 2 controller.The optimization runs well and I manage to obtain the result.
However from the simulation result it is shown that the mean fitness value does not change much during the simulation as in it is almost a straight line for the whole simulation.I tried to change a few options like crossover,mutation,selection and even increase the population and the range of the initial population but there is no change on how the mean fitness proceed during the simulation.I attached a diagram here for illustration.I read a few examples on MATLAB and usually the way mean fitness proceed is almost like the best fitness value where there is a slight curve in a decreasing manner as we proceed from one generation to another generation.So my questions are below
1)Is it normal for the mean fitness value of my optimization process to behave in such a manner?If it is an indication of bad result,how can I proceed to improve the result of the simulation?
2)How does one confirm or verify that the results from the optimization via GA (genetic algorithm) is good?
Please kindly give me some guidance on this.Thanks.
##### 2 CommentsShowHide 1 older comment
Kah Wei on 6 Aug 2014
Hi Geoff.Nice to talk to you again.
The changes that I made on the options of the genetic algorithm are shown in the program that I attached (calculation_error2.m)
On multiple runs (5 runs to be exact) I am getting almost the same reading for the best fitness and the mean fitness value.I also initiate rng ('shuffle') as recommended by you.The value for best fitness also stay roughly the same during the simulation as shown on the command window from the diagnostic information.
How can I modify the parameters to avoid premature convergence and get the best solution?Is there any scientific way to achieve that rather than trial and error which is time consuming and not efficient?

Geoff Hayes on 6 Aug 2014
Hi Kah,
Five runs without much change could mean either that your code has found the optimal solution or it is just getting stuck.
I noticed that you are using the following for the crossover
options = gaoptimset(options,'CrossoverFraction',0.8,...
'CrossoverFcn',@crossovertwopoint);
According to the documentation (see crossover options), this crossover function works as follows
Two point (@crossovertwopoint) selects two random integers m and n between 1 and Number of variables. The function selects
Vector entries numbered less than or equal to m from the first parent Vector entries numbered from m+1 to n, inclusive, from the second parent Vector entries numbered greater than n from the first parent. The algorithm then concatenates these genes to form a single gene.
For example, if p1 and p2 are the parents [a b c d e f g h] and [1 2 3 4 5 6 7 8] respectively, and the crossover points are 3 and 6, the function returns the following child [a b c 4 5 6 g h].
So while it does produce a child that is different from either parent, i.e. the child has a subset of the variables from both parents, it does not change the variables at all - they still have the same values as the parent variables. This means that the variable values will only change due to a mutation.
I suggest you try the crossoverintermediate, crossoverarithmetic, or crossoverheuristic. All three indicate that the child variable will be some combination of the two from the parents - so it won't necessarily be the same. The first two should ensure (if you leave the default Ratio as is) that the resulting child variable is still within the interval defined for that variable. The crossoverheuristic may not do that unless you set the Ratio to be one or less. It has an interesting algorithm whereby the child variable will be closer to the variable of the parent with the better fitness. So you could try this one first with a ratio of 0.95 say. The default ratio is 1.2 so you could leave as is, though your new child variable value may fall outside of the interval (unless the GA code can be told to enforce intervals).
_____________________________
I also noticed that the intervals for your 11 variables are very narrow. If you look at the difference between the lower and upper bound for each one, we see them as
1.123
3.38
0.999
0.68177
2.476
1.049
0.3739
2.193
0.3287
1.0671
0.691
For the seventh and ninth variables, the difference between their lower and upper bounds are 0.3739 and 0.3287 respectively. I don't know what either variable represents, but I just wonder what would happen if you fix both to something and then reduce your problem to a nine variable optimization problem, would that lead to a better solution?
Try changing the crossover operator to the heuristic one - leave the default ratio of 1.2 and see what happens. Continue to decrease it on subsequent runs to 0.95, 0.85,...,0.45 (maybe no lower than this as then the parent with the worse fitness will have its variables weighted more). If you see no change in the convergence of your solution, then try reducing your problem from 11 variables to 10, to 9 for those with the small intervals.
_____________________________
By the way, how long does it take one run of the GA (so population 3000, 100 iterations/generations)?
Geoff Hayes on 9 Aug 2014
Awesome that you got some good results, Kah!
For the "penalty", I had meant something along the lines of: suppose that x1,x2,x3,…,x11 are the optimized values for each of the 11 values. If you pass these 11 variables into the fitness function, the scalar result is y (say). And you know all of this from your previous runs/simulations. Now suppose that a parent or child has 11 variables that are considered near-identical to x1,x2,…,x11 i.e. if x is the input vector to the fitness function, then within this function do
tolerance = 0.000001;
if abs(x1-x(1))<tolerance && ...
abs(x2-x(2))<tolerance && ...
%etc.
abs(x11-x(11))<tolerance
% the input variables are considered near-identical to the
% optimal solution found on previous runs, so just double
% the fitness function value
outFitnessVal = 2*y;
end
So now, this parent or child has a fitness function value that is double what it should be, and so it may be discouraged from "reproducing" and so another solution may become optimal.
Since your results improved with varying the mutation rate and the crossover ratio, you can now ignore this. Though it would be interesting to see what happens if you run 100s of simulations.
As for The value for best fitness value and the mean fitness value are still quite close though, I think that this is expected - over time, the population will converge to the optimal solution, so the average should start to edge closer to this optimal solution.

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!