10-Armed Bandit Testbed Using a Greedy Algorithm

This script creates a 10-armed bandit testbed and runs a greedy algorithm on it.
387 Downloads
Updated 12 Mar 2018

The testbed is a set of 2000 randomly generated k-armed bandit problems with k = 10. For each bandit problem, the action values q*(a), a = 1, 2, ..., 10, were drawn from a normal (Gaussian) distribution with mean 0 and variance 1. Then, when a learning method applied to that problem selected action A_t at time step t, the actual reward R_t was drawn from a normal distribution with mean q*(A_t) and variance 1.
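
The construction just described can be sketched in a few lines. This is a Python illustration using only the standard library, not the MATLAB script from this submission; the names `make_bandit` and `sample_reward` and the fixed seed are assumptions made for the example, and actions are indexed 0 to 9 rather than 1 to 10.

```python
import random

def make_bandit(k=10, rng=random):
    """Draw the true action values q*(a) of one k-armed bandit from N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(k)]

def sample_reward(q_star, action, rng=random):
    """Draw the reward for the selected action from N(q*(action), 1)."""
    return rng.gauss(q_star[action], 1.0)

rng = random.Random(0)  # fixed seed, chosen here only for reproducibility

# 2000 independent bandit problems, each with 10 arms.
testbed = [make_bandit(10, rng) for _ in range(2000)]

# One noisy reward for pulling arm 3 of the first problem.
reward = sample_reward(testbed[0], action=3, rng=rng)
```

Because the variance is 1, the standard deviation passed to `gauss` is also 1.
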
For any learning method, we can measure its performance and behavior as it improves with experience over 1000 time steps when applied to one of the bandit problems. This makes up one run. Repeating this for 2000 independent runs, each with a different bandit problem, gives a measure of the learning algorithm's average behavior.
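
The run-and-average protocol can be sketched as a small harness that works with any learning method. Again this is a Python illustration rather than the submission's MATLAB code. The `select`/`update` agent interface, the `average_reward_curve` function, and the `RandomAgent` placeholder are all hypothetical names introduced for the example.

```python
import random

class RandomAgent:
    """Placeholder learner (an assumption for illustration): picks arms uniformly."""
    def __init__(self, k, rng):
        self.k, self.rng = k, rng

    def select(self):
        return self.rng.randrange(self.k)

    def update(self, action, reward):
        pass  # a real learning method would update its estimates here

def average_reward_curve(make_agent, runs=2000, steps=1000, k=10, seed=0):
    """Return the reward at each time step, averaged over independent runs."""
    rng = random.Random(seed)
    totals = [0.0] * steps
    for _ in range(runs):
        # Each run uses a freshly generated bandit problem and a fresh agent.
        q_star = [rng.gauss(0.0, 1.0) for _ in range(k)]
        agent = make_agent(k, rng)
        for t in range(steps):
            action = agent.select()
            reward = rng.gauss(q_star[action], 1.0)
            agent.update(action, reward)
            totals[t] += reward
    return [total / runs for total in totals]

# Smaller than the 2000 x 1000 setting in the text, to keep the sketch quick.
curve = average_reward_curve(RandomAgent, runs=200, steps=100)
```

The returned list has one entry per time step, which is the quantity one would plot against the step index.
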
We use the sample-average technique for action-value estimates and evaluate a greedy algorithm by plotting the reward at each time step, averaged over the 2000 runs. The code can be modified for a non-greedy algorithm as well.
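
A minimal Python sketch of greedy action selection with sample-average estimates follows; it is not the author's MATLAB script. It assumes initial estimates of 0 and random tie-breaking among equally valued arms, neither of which is stated above, and it uses the incremental form Q <- Q + (R - Q) / N, which equals the running mean of the rewards seen for that arm. The function name `greedy_run` is hypothetical.

```python
import random

def greedy_run(q_star, steps, rng):
    """Run one greedy agent with sample-average estimates on one bandit problem."""
    k = len(q_star)
    q_est = [0.0] * k   # assumed initial action-value estimates
    counts = [0] * k    # number of times each arm has been pulled
    rewards = []
    for _ in range(steps):
        # Greedy choice: the arm with the highest estimate, ties broken at random.
        best = max(q_est)
        action = rng.choice([i for i in range(k) if q_est[i] == best])
        reward = rng.gauss(q_star[action], 1.0)
        # Incremental sample-average update for the chosen arm.
        counts[action] += 1
        q_est[action] += (reward - q_est[action]) / counts[action]
        rewards.append(reward)
    return rewards, q_est, counts

rng = random.Random(0)
runs, steps, k = 200, 1000, 10  # fewer runs than the 2000 in the text, for speed
avg = [0.0] * steps
for _ in range(runs):
    q_star = [rng.gauss(0.0, 1.0) for _ in range(k)]
    rewards, _, _ = greedy_run(q_star, steps, rng)
    for t, r in enumerate(rewards):
        avg[t] += r / runs
```

Here `avg[t]` holds the average reward at step `t` across runs. Replacing the greedy choice with an occasional random arm would be one way to obtain a non-greedy variant.
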

Cite As

Sai Sandeep Damera (2025). 10- Armed Bandit Test bed using greedy algorithm (https://la.mathworks.com/matlabcentral/fileexchange/66467-10-armed-bandit-test-bed-using-greedy-algorithm), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2017b
Compatible with any release
Platform Compatibility
Windows macOS Linux

Version Published Release Notes
1.0.0.0