RL Handbook

References

Books, papers, courses, notes, and code resources used while writing the handbook.

This page collects the materials used to write and check the handbook chapters. They are also good resources for deepening your own understanding of RL.

Books

Courses, Notes, and Code

  • VachanVY. Reinforcement-Learning
    A strong implementation resource with good coverage of the basic algorithms: policy iteration, value iteration, Monte Carlo, SARSA, Q-learning, PPO, DDPG, and SAC. Useful for checking the mechanics of each algorithm against small, executable code.

  • Yandex Data School. Practical RL
    An open RL course with lectures, seminar notebooks, homework-style material, and Colab-friendly practical exercises across classical and deep RL.

  • Tim Miller. Mastering Reinforcement Learning: Markov Decision Processes
    Clear online notes on MDPs, policies, Bellman equations, policy extraction, and partially observable MDPs, useful as a second explanation for the foundations.

  • Boyu AI. Hands-on Reinforcement Learning
    A practical course-style resource with chapter text, slides, videos, and runnable notebook links, moving from tabular RL to deep RL and selected advanced topics.

  • OpenAI Spinning Up. Spinning Up in Deep RL
    A compact deep RL reference with introductions, key papers, pseudocode, and implementation notes for policy gradients, TRPO, PPO, DDPG, TD3, and SAC.

  • AdithyaSK. The ultimate guide to RL environments: building and scaling them in the LLM era
    Great resource for understanding RL environments: how to design them, scale them, evaluate them, and think about them in modern LLM-oriented RL setups.

Papers