TD3 and SAC
Twin Delayed DDPG and Soft Actor-Critic with maximum entropy RL.

Placeholder content for TD3 and SAC.