
March 11, 2024 86 mins

172: Transformers and Large Language Models


Intro topic: Is WFH actually WFC?

News/Links:


Book of the Show


Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h


Tool of the Show


Topic: Transformers and Large Language Models

  • How neural networks store information
    • Latent variables
  • Transformers
    • Encoders & Decoders
  • Attention Layers
    • History
      • RNN
        • Vanishing Gradient Problem
      • LSTM
        • Short term (gradient explodes), Long term (gradient vanishes)
    • Differentiable algebra
    • Key-Query-Value
    • Self Attention
  • Self-Supervised Learning & Forward Models
  • Human Feedback
    • Reinforcement Learning from Human Feedback
    • Direct Preference Optimization (Pairwise Ranking)
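
For the Key-Query-Value and Self Attention items above, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration only, not code from the episode: the function names (`softmax`, `matmul`, `self_attention`) and the weight matrices `w_q`, `w_k`, `w_v` are assumptions made for this example, and real models add multiple heads, masking, and learned weights.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x (one row per token).

    w_q, w_k, w_v are hypothetical projection matrices for this sketch.
    Returns (outputs, attention_weights).
    """
    q = matmul(x, w_q)  # queries: what each token is looking for
    k = matmul(x, w_k)  # keys: what each token offers
    v = matmul(x, w_v)  # values: the content that gets mixed
    d_k = len(k[0])
    # Score every query against every key, scaled by sqrt(d_k).
    scores = [
        [sum(qi * ki for qi, ki in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in k]
        for q_row in q
    ]
    # Each row of weights sums to 1 and says how much a token attends to the others.
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights
```

Because every step is a differentiable matrix operation, the projections can be trained with gradient descent, which is the sense of "differentiable algebra" in the outline.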

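For the pairwise ranking item under Human Feedback, a rough sketch of the loss for one preference pair, assuming the standard Direct Preference Optimization formulation (a logistic loss on the difference of log-probability ratios against a frozen reference model). The function name `dpo_loss`, its argument names, and the default `beta=0.1` are hypothetical choices for this example, not values from the episode.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Pairwise preference loss for one (chosen, rejected) response pair.

    Inputs are summed log-probabilities of each response under the policy
    being trained and under a frozen reference model. beta is an assumed
    temperature-like hyperparameter.
    """
    # How much more the policy prefers the chosen response than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) written in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

The loss falls as the policy ranks the chosen response above the rejected one more strongly than the reference model does, so no separate reward model or reinforcement learning loop is needed for this step.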


★ Support this podcast on Patreon ★