A new technical report on temporal difference (TD) learning for games and "self-play" algorithms for game-agent training is available. This report by Wolfgang Konen gives a gentle introduction to TD learning for game play and offers hints for the practitioner on implementing such algorithms. It lists references to the most recent applications in this field, and an appendix discusses the more advanced topic of eligibility traces and how and why they work.
This report should help people who are new to the field of TD learning for games as well as people who already work in this field but struggle with specific details. It is an updated English translation of an earlier report in German.
Wolfgang Konen, Reinforcement Learning for Board Games: The Temporal Difference Algorithm. Research Center CIOP (Computational Intelligence, Optimization and Data Mining), Cologne University of Applied Sciences, 2015.
Ms. Samineh Bagheri has won the 3rd prize in the annual Erzquell award with her master's thesis "Efficient Surrogate Assisted Optimization for Constrained Black-Box Problems". My most cordial congratulations to her!
Many real-world optimization problems involve constraints. A valid solution of a constrained optimization problem (COP) lies in the feasible region, which is a subset of the input space. The borders of the feasible region are defined by one or more constraints. The existence of constraints makes optimization problems more demanding. There are different approaches to handling constraints, but none of them outperforms all others on all types of COPs.
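To make the terms above concrete, here is a minimal sketch of a toy COP in Python. The objective, the three constraints, and the helper names `max_violation` and `is_feasible` are illustrative assumptions and are not taken from the thesis or the paper; the constraints are written in the common form g_i(x) <= 0.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def objective(x: Vector) -> float:
    """Hypothetical objective: squared distance from the point (2, 2)."""
    return (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2


# Hypothetical constraints, each in the form g_i(x) <= 0.
# Together they define the feasible region (a triangle in the plane).
constraints = [
    lambda x: x[0] + x[1] - 2.0,  # x0 + x1 <= 2
    lambda x: -x[0],              # x0 >= 0
    lambda x: -x[1],              # x1 >= 0
]


def max_violation(x: Vector, gs: Sequence[Callable] = constraints) -> float:
    """Largest constraint violation at x; 0.0 if no constraint is violated."""
    return max(0.0, *(g(x) for g in gs))


def is_feasible(x: Vector, gs: Sequence[Callable] = constraints,
                tol: float = 1e-9) -> bool:
    """A point is feasible if no constraint is violated by more than tol."""
    return max_violation(x, gs) <= tol


if __name__ == "__main__":
    for point in [(2.0, 2.0), (1.0, 1.0)]:
        print(point, objective(point), max_violation(point), is_feasible(point))
```

In this toy problem the unconstrained minimizer (2, 2) lies outside the feasible region, while the point (1, 1) sits on the border defined by the first constraint. This shows in a small way why constraints make the search harder: the best feasible point is often located on a constraint boundary.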
Only a few constraint handlers work by repairing a good infeasible solution, that is, by generating feasible solutions from the information carried by infeasible candidates. We proposed a new technique to repair infeasible solutions. The corresponding paper was accepted at GECCO 2015. A talk about this work will be given at the GECCO conference, held on July 11-15 in Madrid.
The DAAD prize of Cologne University of Applied Sciences (CUAS) goes this year to Campus Gummersbach. The Iranian master's student Samineh Bagheri from the Master programme "Automation & IT" is awarded this prize. Cordial congratulations!
More information on this year's DAAD prize can be found in the official press release of CUAS (in German). Ms. Samineh Bagheri is currently writing her master's thesis in the context of my research project MONREP, where she also works as a student research assistant.
The 24th Workshop Computational Intelligence 2014, an annual conference held in Dortmund by the Computational Intelligence (CI) Chapter of VDI-GMA (Gesellschaft für Mess- und Automatisierungstechnik), has given the Young Author Award to Patrick Koch, PhD, scientific member of my research group at Campus Gummersbach. I am very happy for him and congratulate him cordially!
Mr. Markus Thill has won the first prize in the 2012 OPITZ CONSULTING "Innovation in Informatics" contest. Many congratulations from the CIOP team!
Mr. Thill's thesis advanced the state of the art in reinforcement learning for complex board games, in this case Connect Four. Read more about his work on this page.
The CIOP team is proud to announce that the latest version (V 0.9.0, February 2013) of the Tuned Data Mining in R (TDMR) package is now available for download on the Comprehensive R Archive Network (CRAN).
Download the newly released version from this link.
Title: Self-Adaptive Algorithms for Finding Robust Optima: Promises and Limitations
Time: Fri., Oct. 26th, 2012, 11:00-11:45
Place: Room 0.214
Many problems in engineering design deal with locating optimal parameter configurations for systems. Evolution strategies provide a robust framework for this. This talk deals with the question of how we can find optima that are robust to stochastic perturbations of the input variables and to noise on the output variables. A bifurcation-based classification of types of robust optima is provided, viewing the integration of robustness as a Weierstrass transformation. Based on a dynamical systems analysis of evolution strategies, the limits of self-adaptive schemes for controlling the sample size of self-adaptive robust evolution strategies are shown. Finally, some recently developed efficient archiving and modeling strategies for speeding up optimization with costly evaluations are highlighted.
Publication: Wolfgang Konen, Patrick Koch, How slow is slow? SFA detects signals that are slower than the driving force, In: B. Filipic, J. Silc (eds.), Proc. 4th Int. Conf. on Bioinspired Optimization Methods and their Applications, BIOMA 2010, May 2010, Ljubljana, Slovenia (PDF)
Publication: Oliver Flasch, Thomas Bartz-Beielstein, Artur Davtyan, Patrick Koch and Wolfgang Konen, Comparing SPO-tuned GP and NARX Prediction Models for Stormwater Tank Fill Level Prediction. In P. Sobrevilla (ed.), Proc. WCCI, July 2010, Barcelona (PDF)