AI/ML STIG Lecture Series

Artificial Intelligence and Machine Learning Science and Technology Interest Group (AI/ML STIG)

Location

Virtual

Dates

9 February 2026
4:00pm ET

Community

AI/ML STIG

Type

Seminar

Transformers

Speaker

Helen Qu, Flatiron

Build a decoder-only transformer (a small GPT-like language model) from scratch in PyTorch. Train it on the Tiny Shakespeare dataset for character-level language modeling and use it to generate text, understanding every component along the way.
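The heart of such a decoder-only model is causal self-attention. Below is a minimal, hedged sketch of the idea (not the speaker's actual code): a single attention head with the learned query/key/value projections omitted for brevity, so the input embeddings stand in for all three. It shows the two topics from the list above in one place: attention as a data-dependent mixing operator, and the causal mask that blocks each token from attending to future positions.

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x):
    """Single-head causal self-attention (illustrative sketch: the learned
    query/key/value projections are omitted, so x plays all three roles)."""
    T, d = x.shape
    scores = x @ x.transpose(-2, -1) / d**0.5           # (T, T) pairwise similarity
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))    # hide future tokens
    weights = F.softmax(scores, dim=-1)                 # each row sums to 1
    return weights @ x                                  # data-dependent mixing of values

x = torch.randn(5, 8)                 # 5 tokens, 8-dim embeddings
out = causal_self_attention(x)
print(out.shape)                      # torch.Size([5, 8])
```

Because of the mask, the first token can only attend to itself, so its output equals its input; later tokens mix information from all earlier positions.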

Topics Covered:

  • Self-attention as a learned, data-dependent mixing operator
  • Causal (masked) self-attention for autoregressive modeling
  • Building a GPT-style Transformer block from scratch
  • Token and positional embeddings
  • Training a small autoregressive language model
  • Text generation with temperature and top-k sampling
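The last bullet, generation with temperature and top-k sampling, can be sketched in a few lines. This is an illustrative standalone snippet, not material from the lecture; the logits vector here is a made-up example standing in for a trained model's output over the vocabulary.

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, temperature=1.0, top_k=None):
    """Sample one token id from raw logits, GPT-style."""
    logits = logits / temperature                   # <1 sharpens, >1 flattens
    if top_k is not None:
        kth = torch.topk(logits, top_k).values[-1]  # k-th largest logit
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

torch.manual_seed(0)
logits = torch.tensor([2.0, 1.0, 0.1, -1.0])        # hypothetical 4-token vocabulary
token = sample_next_token(logits, temperature=0.8, top_k=2)
print(token)                                        # 0 or 1: only the top-2 survive
```

With `top_k=2`, tokens outside the two largest logits get zero probability, and lowering the temperature pushes the sampler further toward the single most likely token.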