Topic: Risk-Aware Data-Driven Reinforcement Learning
Supervisor: Karel Macek; Guarantor: Doc. Ing. Karel Zimmermann, Ph.D.
Offered as: Master's thesis, Bachelor's thesis
Description: PDF version
Reinforcement learning has been successfully applied in various domains, ranging from robotics to algorithmic trading. This powerful class of algorithms is rooted in dynamic programming and benefits from larger amounts of data as well as from advanced regression techniques. The common practice is to maximize the expected reward. However, especially in finance, risk measures such as Value at Risk are also taken into account. Over the past decade, there have been several attempts to make reinforcement learning risk-aware. The intended project will focus on the existing approaches and on extending them for fast computation in continuous domains.
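To illustrate why the expected reward alone can be misleading, here is a minimal Python sketch. It compares two hypothetical policies whose returns have the same mean but a different spread. The function `value_at_risk`, the Gaussian return model, and the level `alpha=0.05` are assumptions made for this example, not part of the project specification.

```python
import random
import statistics


def value_at_risk(returns, alpha=0.05):
    """Empirical VaR: the loss exceeded in roughly a fraction alpha of samples."""
    ordered = sorted(returns)
    # Index of the empirical alpha-quantile in the lower tail of the returns.
    k = max(0, min(len(ordered) - 1, int(alpha * len(ordered))))
    # A loss is a negative return, so flip the sign of the quantile.
    return -ordered[k]


rng = random.Random(0)
# Two hypothetical policies: same expected return, different variability.
safe = [rng.gauss(1.0, 0.5) for _ in range(10_000)]
risky = [rng.gauss(1.0, 3.0) for _ in range(10_000)]

for name, sample in (("safe", safe), ("risky", risky)):
    print(name, round(statistics.mean(sample), 2), round(value_at_risk(sample), 2))
```

Both samples have nearly the same mean, so a criterion based only on the expected return cannot tell them apart, whereas the empirical VaR of the second sample is much larger.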
- Set up 1-2 simulation environments (preferably: one from robotics, one from finance)
- Combine (i) risk-aware decision making with (ii) Fitted Q-iteration by advantage weighted regression
- Compare with other methods (discretized risk-aware reinforcement learning, standard Fitted Q Iteration)
- An interest in mathematics and machine learning is assumed
- Python (it is fine to learn it on the fly)
- The balance between theoretical foundations and experimentation can be adjusted with respect to your preferences
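The standard Fitted Q Iteration baseline named in the comparison task can be sketched roughly as follows. This is only an illustration under stated assumptions: the 1-D toy task `step`, the k-nearest-neighbour regressor `knn_predict`, and all constants are hypothetical. It is neither the advantage weighted variant nor a risk-aware method.

```python
import random

GAMMA = 0.9          # discount factor (assumed)
ACTIONS = (-1, 1)    # move left / move right


def step(s, a, rng):
    """Hypothetical toy task on [0, 1]: reward 1 when landing near the right end."""
    s_next = min(1.0, max(0.0, s + 0.1 * a + rng.gauss(0.0, 0.01)))
    return (1.0 if s_next > 0.9 else 0.0), s_next


def knn_predict(data, s, k=5):
    """Average target of the k training states nearest to s (a simple regressor)."""
    nearest = sorted(data, key=lambda pair: abs(pair[0] - s))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def fitted_q_iteration(batch, n_iter=30):
    # q[a] stores (state, target) pairs; Q(s, a) is read off via knn_predict.
    q = {a: [(s, 0.0) for s, b, _, _ in batch if b == a] for a in ACTIONS}
    for _ in range(n_iter):
        new_q = {a: [] for a in ACTIONS}
        for s, a, r, s_next in batch:
            # Bellman target: r + gamma * max over a' of Q_k(s', a')
            target = r + GAMMA * max(knn_predict(q[b], s_next) for b in ACTIONS)
            new_q[a].append((s, target))
        q = new_q  # "fit" the regressor to the new targets
    return q


def greedy(q, s):
    """Action with the highest estimated Q-value in state s."""
    return max(ACTIONS, key=lambda a: knn_predict(q[a], s))


rng = random.Random(0)
batch = []
for _ in range(400):
    s, a = rng.random(), rng.choice(ACTIONS)
    r, s_next = step(s, a, rng)
    batch.append((s, a, r, s_next))

q = fitted_q_iteration(batch)
print([greedy(q, s) for s in (0.2, 0.5, 0.8)])
```

The algorithm learns only from the fixed batch of transitions, which is what makes it data-driven. Any regressor could replace the nearest-neighbour average used here.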
Literature: García, J.: A Comprehensive Survey on Safe Reinforcement Learning. Journal of Machine Learning Research 16 (2015), 1437-1480. Morimura, T.: Nonparametric Return Distribution Approximation for Reinforcement Learning. Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010. Neumann, G.: Fitted Q-iteration by Advantage Weighted Regression. Advances in Neural Information Processing Systems, 2009.
Posted on: 6 April 2017