Territories: Value-Based, Foundations, Policy-Based, Decision-Time Planning, Background Training

Methods: Bandits, MDP, Dynamic Programming, MC & TD, Sarsa, Q-Learning, DQN, Double DQN, Dueling DQN, C51, Rainbow, REINFORCE, A2C / A3C, TRPO, PPO, DDPG, TD3, SAC, Dyna-Q, MPC, MCTS, AlphaZero, MuZero, MBPO, Dreamer

The Map of RL

A curated lineage of handbook methods. Click a node for the chapter, or click a territory to isolate a family.