Effect of rng on Genetic Algorithm

6 visualizaciones (últimos 30 días)
Mirhan
Mirhan el 9 de Ag. de 2025
Comentada: Walter Roberson el 20 de Ag. de 2025
Dear all,
I’m working with a MATLAB genetic algorithm optimization code. Initially, I ran the algorithm without specifying an RNG seed, and as expected, each run converged to slightly different points. To address this, I used rng(0,"twister"), which allowed me to reproduce the same results for every run for same parameters.
However, I noticed that these results were not close to the known global optimum. When I changed the seed to rng(100,"twister"), the algorithm converged to the global optimum.
On the other hand, for a different optimization case, using a seed value of 100 did not yield the optimum. This suggests that the optimal seed may be unique to each combination of design parameters, which would mean running multiple GAs to identify the best seed value.
My question is: is there a way to ensure both reproducibility and convergence to the global optimum?
Best regards,
Mirhan
seed = 100;
rng(seed,"twister"); % Fix the random stream for reproducibility.
format short;
  1 comentario
Stephen23
Stephen23 el 9 de Ag. de 2025
Editada: Stephen23 el 19 de Ag. de 2025
"is there a way to ensure both reproducibility and convergence to the global optimum?"
No such thing exists. If such a deterministic solution existed then everyone would use it... and there would not be the millions of (mostly animal-inspired) optimisation routines that are essentially all derivations from GA.
Where is the minimum of this function ALMOSTIMPOSSIBLE? What heuristic could be used to find it?
fplot(@almostimpossible)
function y = almostimpossible(x)
y = x.^2;
y(x==sqrt(pi)) = -1;
end
Your question indicates some confusion about stochastic optimization routines. Genetic algorithms are metaheuristics that use randomness to explore the search space. Different random seeds lead to different exploration paths, and some paths happen to find better solutions than others. In general there's no way to predict which seed will work best for a given problem without actually running the optimization.
The approach of "running multiple GAs to find the best seed" defeats the purpose of reproducibility - you'd need to test many seeds to find the good one, but then you're essentially doing multiple random runs anyway, tending towards brute-force.
Your options are:
  1. Reproducible but potentially suboptimal: Fix one seed and accept whatever local optimum it finds
  2. Non-reproducible but better exploration: Run multiple times with different seeds and take the best result
  3. Compromise: Run a fixed set of predetermined seeds and report statistics (mean, best, worst)
The fundamental issue is that if there were a deterministic way to guarantee finding the global optimum, we wouldn't need genetic algorithms - we'd use that deterministic method instead. The randomness isn't a bug to be eliminated; it's the core feature that allows these algorithms to escape local optima.

Iniciar sesión para comentar.

Respuestas (2)

Walter Roberson
Walter Roberson el 9 de Ag. de 2025
There is no general way using ga() to ensure repeatability and convergence to the global optimum.
With some particular forms of functions (for example, quadratics) it has been shown that ga() will always converge within a particular number of steps, but if you do not happen to be using one of those function forms then there is no promise that ga() will ever converge to the global optimum.
If you have a function such as a shortest-path then it is common for ga() to get caught needing to exchange two branches to achieve the shortest path, and yet for it to be very very unlikely to be able to find the right nodes to exchange. The expected time to find the right nodes to exchange might exceed the time until the Sun swells into a Red Giant and destroys the Earth.
A lot of the time, it is not known what the global optimum is.
  1 comentario
Walter Roberson
Walter Roberson el 20 de Ag. de 2025
Imagine that you start somewhere flat such as the prairies of Canada, and you start drilling a 10 cm borehole downwards. Imagine that as you dig, you let the dirt pile up around the hole, so the deeper the hole the taller the hill around the hole.
Now imagine using a genetic algorithm to try to find the deepest place on Earth. If you start from nearly anywhere on Earth, you are very unlikely to get anywhere near to trying to climb the hill around the borehole. If you do happen to start "near" the hill, you are unlikely to climb the entire hill to encounter the 10 cm borehole. If you do happen to try borehole, then if your steps are too large then you will probably step outside the hill outside the borehole and try descending the hill instead of following the borehole. You need to have started from just the right place and be moving slowly enough to descend the borehole.
It is not impossible that everything happened to align so that you try just the right area of the borehole at just the right range of speeds, but the probability is low. You would be far more likely to explore the Mariana Trench than the borehole.
Russia dug the Kola Superdeep Borehole 12,262 meters deep... but it has been sealed up since 2005. Does it count for there to be a super-deep hole if there is a lid on it?

Iniciar sesión para comentar.


John D'Errico
John D'Errico el 9 de Ag. de 2025
Editada: John D'Errico el 9 de Ag. de 2025
Let me expad on what @Walter Roberson said. There is no optimum seed, that is one seed that will be perfect for any problem. In fact, when you specify rng, you start the random sequence in some completely arbitrary way. At that point, the sequence of random numbers you get will now be perfectly repeatable. For example...
rng(0,'twister')
rand
ans = 0.8147
rand
ans = 0.9058
If I call rand again, I will get another random number, a different one. BUT, if I use the same seed again, I will restart the sequence.
rng(0,'twister')
rand(1,3)
ans = 1×3
0.8147 0.9058 0.1270
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
Again, the same sequence. Everything gets reset, when I reset the seed.
And when you do that, effectively, you have chosen every successive random number that GA uses. The result will be whatever comes out the end. But you have asolutely no control over the result. It may be good. It may be complete crap. If you change the seed to some other number, you will get a different result, based on the different starting state that GA will see. But there is ABSOLUTELY no way to predict what will happen, at least not based on the seed.
And for that you need to understand the concept of a basin of attraction for an optimizer. The analogy I like to use is that of setting a blind person down on the face of the earth, and telling them to find the point of lowest altitude. All they are given is a cane and an altimeter, one that will magically work under water. Ok, give them scuba gear too. Your blind investigator must walk downhill, based only on what they can interpret locally, using ONLY the cane to guide them as to what is downhill.
How well do you think your blind searcher will do? Now, try the experiment again, but this time from the EXACT same location, facing initially in exactly the same direction? Sorry, but they will get to the same spot at the end, at a point where no direction they walk will be downhill. That soluution may be a locally minimum. In fact, there is a very good chance it will be. Maybe you dropped them in some valley depression to start with. Maybe near the Dead Sea.
Now restart it again, but this time, your random start point might be at the top of Mt Everest. Yes, they will go downhill fast from there. But it is likely they will now get stuck, maybe in the depths of the Indian ocean. It will be a low spot probably. But even so, it will not be the lowest point on earth.
Every optimizer has the same thing. It is called a basin of attraction. Every possible local solution has what is called a "basin of attraction". That is the set of start points that will result in that given solution. And each final point you will arrive at must have started from some point in the associated basin of attraction. This is true, even for "random" tools like GA. Start them at a given point, with the random seed known, and they will end at a given final point.
And for a random start, you have NO control over the final point. There is no magically good random start. You will get lucky, or not.
Tools like GA are designed to be moderately robust against poor starts. but even so, they will often end at a sub-optimal point, Again, start in a bad basin, and you will not be happy. But changing the random seed, and hoping GA will do better is just a flip of a coin. There is NO good seed, NO bad seeds. (Well, personally, I like 17 as a good seed. But I am biased. And some will claim 42 is a good seed. It might be the answer to everything, after all.)
There is NO magic seed that will ever insure convergence to the global optimum. And even though some guy named Jack had what were purportedly magic seeds, you really don't want to climb that beanstalk.

Categorías

Más información sobre Statistics and Machine Learning Toolbox en Help Center y File Exchange.

Productos


Versión

R2025a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by