In reinforcement learning, an agent takes actions within an environment. Usually, the states of both the agent and the environment change in response to each action. The agent then receives a reward, a signal that tells it whether the action had a positive or negative outcome.
The goal of a learning agent is to act so as to maximize the reward it collects over time.
An agent can be anything from a fixed set of if-else statements to a deep neural network.
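The interaction loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: `CorridorEnv`, its reward values, and both agents are hypothetical names invented for this example and do not come from any particular library.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the corridor
        self.position = max(0, min(self.length, self.position + action))
        done = self.position == self.length
        # assumed reward scheme: +1 at the goal, small penalty for every other step
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def rule_based_agent(state):
    """An agent that is nothing more than a fixed rule: always move right."""
    return 1


def random_agent(state):
    """An agent that ignores the state and acts uniformly at random."""
    return random.choice([-1, 1])


def run_episode(env, agent, max_steps=100):
    """Run one episode and return the sum of rewards the agent collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent(state)                   # the agent acts
        state, reward, done = env.step(action)  # the environment reacts
        total_reward += reward                  # the agent is told how it did
        if done:
            break
    return total_reward
```

Calling `run_episode(CorridorEnv(), rule_based_agent)` runs the hard-coded agent for one episode. A learned policy, such as a neural network, would fit the same `agent(state)` interface.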
Evolution strategies in RL
A survey of evolution strategies for RL (Müller and Glasmachers 2018).
Other/Misc algorithms, hacks and tricks
Current RL is full of tricks that make the algorithms behave the way we want them to. It is not clear whether this collection of tricks makes the algorithms better overall or over-specializes them for a particular type of application.
Exploration bonuses are a class of methods that encourage an agent to keep exploring even when the environment reward is sparse (e.g. Burda et al. 2018). They work by adding an extra term to the reward. Depending on how that term is defined, it can push the agent toward states that look different from the ones it has seen before, toward states reached through different histories, and so on.
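As a minimal sketch of the extra reward term, the example below uses a count-based bonus: the less often a state has been visited, the larger the bonus. This is one simple member of the class, chosen for brevity; it is not the random network distillation method of Burda et al. The name `CountBonus`, the `beta` scale, and the `beta / sqrt(count)` form are assumptions made for this illustration.

```python
import math
from collections import Counter


class CountBonus:
    """Hypothetical count-based exploration bonus for hashable (e.g. discrete) states."""

    def __init__(self, beta=0.1):
        self.beta = beta         # assumed scale of the bonus relative to the env reward
        self.counts = Counter()  # number of visits per state

    def __call__(self, state, env_reward):
        # Record the visit, then add a bonus that shrinks as the state becomes familiar.
        self.counts[state] += 1
        bonus = self.beta / math.sqrt(self.counts[state])
        return env_reward + bonus
```

In a training loop, the agent would learn from `shaped = bonus(state, reward)` instead of the raw `reward`. Even when the environment reward is zero almost everywhere, unvisited states still yield a positive signal.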
- Burda, Yuri, Harrison Edwards, Amos Storkey, and Oleg Klimov. 2018. "Exploration by Random Network Distillation". arXiv:1810.12894.
- Müller, Nils, and Tobias Glasmachers. 2018. "Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies". arXiv:1806.01224.