AI/ML STIG Lecture Series
Artificial Intelligence and Machine Learning Science and Technology Interest Group (AI/ML STIG)
Module 7: Reinforcement Learning
Location
Virtual
Dates
11 May 2026
4:00pm ET
Community
AI/ML STIG
Type
Seminar
Reinforcement Learning Fundamentals
Speaker
Carol Cuesta-Lazaro, IAS/Flatiron
An introduction to reinforcement learning from its classical foundations to its central role in modern large language models. Trace the arc from TD-Gammon and AlphaGo through DQN and multi-agent emergent behavior to RLHF and RLVR, and learn how policy gradients turn a non-differentiable reward signal into a usable training objective.
Topics Covered
- A brief history of RL: TD-Gammon, AlphaGo, DQN, RLHF, and RLVR
- The agent-environment loop: states, actions, policies, and rewards
- How RL differs from supervised learning: shifting data, exploration vs. exploitation, evaluative and delayed rewards
- The non-differentiability of the learning problem and policy gradients
- The REINFORCE estimator, baselines, and reward-to-go for variance reduction
- Reinforcement Learning from Human Feedback (RLHF) for instruction-tuned models
- Reinforcement Learning from Verifiable Rewards (RLVR) and reasoning in LLMs
- Open questions: is RL teaching new capabilities or sharpening existing ones?


