AI/ML STIG Lecture Series

Artificial Intelligence and Machine Learning Science and Technology Interest Group (AI/ML STIG)

Module 8: Reinforcement Learning

Location

Virtual

Dates

11 May 2026
4:00pm ET

Community

AI/ML STIG

Type

Seminar

Reinforcement Learning Fundamentals

Speaker

Carol Cuesta-Lazaro, IAS/Flatiron

An introduction to reinforcement learning, from its classical foundations to its central role in modern large language models. Trace the arc from TD-Gammon and AlphaGo through DQN and multi-agent emergent behavior to RLHF and RLVR, and learn how policy gradients turn a non-differentiable reward signal into a usable training objective.

Topics Covered

A brief history of RL: TD-Gammon, AlphaGo, DQN, RLHF, and RLVR
The agent-environment loop: states, actions, policies, and rewards
How RL differs from supervised learning: shifting data, exploration vs. exploitation, evaluative and delayed rewards
The non-differentiability of the learning problem and policy gradients
The REINFORCE estimator, baselines, and reward-to-go for variance reduction
Reinforcement Learning from Human Feedback (RLHF) for instruction-tuned models
Reinforcement Learning from Verifiable Rewards (RLVR) and reasoning in LLMs
Open questions: is RL teaching new capabilities or sharpening existing ones?
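
As a rough illustration of the reward-to-go and baseline topics listed above, the sketch below computes the per-step weights that would multiply the gradient of log pi(a_t | s_t) in a REINFORCE-style update. It is not taken from the seminar materials: the function names, the discount factor gamma, and the constant baseline (assumed here to come from elsewhere, for example an average return over earlier episodes) are all illustrative assumptions.

```python
def reward_to_go(rewards, gamma=1.0):
    """Return G_t = sum over k >= t of gamma**(k - t) * rewards[k], for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of the steps after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_weights(rewards, baseline=0.0, gamma=1.0):
    """Reward-to-go minus a baseline (hypothetical helper, for illustration only).

    The baseline is assumed not to depend on the actions taken in this episode,
    which is what keeps the gradient estimate unbiased while lowering its variance.
    """
    return [g - baseline for g in reward_to_go(rewards, gamma)]


if __name__ == "__main__":
    episode_rewards = [1.0, 0.0, 2.0]  # made-up rewards for a three-step episode
    print(reward_to_go(episode_rewards))
    print(reinforce_weights(episode_rewards, baseline=2.0))
```

Using reward-to-go means each action is credited only with the rewards that follow it, rather than with the whole episode's return, which is one common way to reduce the variance of the estimator.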

Downloads

Reinforcement Learning: Classical Foundations and the LLM era

May 12, 2026

PDF (7.78 MB)