April 25, 2025 14 mins

DFloat11 (DF11) brings lossless compression to LLM inference on GPUs by targeting the redundant exponent bits in BF16 weights and encoding them with Huffman coding. Unlike lossy 8-bit quantization, DF11 guarantees bit-for-bit identical outputs while shrinking models to roughly 70% of their original size (about a 30% reduction), which leaves room for bigger batches, longer contexts, and more efficient use of GPU memory. Decompression overhead? Modest, and still faster than CPU offloading. Avobot.com supercharges your AI stack with flat-rate, unlimited access to GPT-4o, Gemini, Claude, DeepSeek, and more via a single API key. To start building, visit Avobot.com.
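
For listeners who want to see why the exponent bits compress so well, here is a rough Python sketch of the idea described above. It is not the DF11 implementation: the weights are synthetic Gaussian samples (an assumption standing in for real LLM weights), BF16 is obtained by truncating float32 rather than rounding, and the helper names (`to_bf16_bits`, `split_bf16`, `huffman_code`, `encode`, `decode`) are made up for illustration. The sign and mantissa bits are left untouched, and only the 8-bit exponent is Huffman coded.

```python
import heapq
import random
import struct
from collections import Counter


def to_bf16_bits(x: float) -> int:
    """Upper 16 bits of the float32 encoding of x (BF16 by truncation, a simplification)."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def split_bf16(bits: int) -> tuple[int, int, int]:
    """Split 16 BF16 bits into (sign, 8-bit exponent, 7-bit mantissa)."""
    return bits >> 15, (bits >> 7) & 0xFF, bits & 0x7F


def huffman_code(freqs: Counter) -> dict[int, str]:
    """Build a prefix code (symbol -> bit string) from symbol counts."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # Heap entries are (count, unique tie breaker, {symbol: code so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, c0 = heapq.heappop(heap)
        n1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols: list[int], code: dict[int, str]) -> str:
    """Concatenate the code words for each symbol into one bit string."""
    return "".join(code[s] for s in symbols)


def decode(bitstring: str, code: dict[int, str]) -> list[int]:
    """Invert encode() by walking the bit string one prefix at a time."""
    inverse = {word: sym for sym, word in code.items()}
    out, current = [], ""
    for bit in bitstring:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Synthetic stand-in for model weights (assumption: zero-mean Gaussian, sigma 0.02).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(20_000)]

exponents = [split_bf16(to_bf16_bits(w))[1] for w in weights]
code = huffman_code(Counter(exponents))
packed = encode(exponents, code)

# Lossless: decoding returns exactly the original exponent sequence.
assert decode(packed, code) == exponents

# 1 sign bit + 7 mantissa bits stay raw; only the exponent is compressed.
bits_per_weight = 1 + 7 + len(packed) / len(exponents)
ratio = bits_per_weight / 16
print(f"distinct exponents used: {len(code)} of 256")
print(f"average bits per weight: {bits_per_weight:.2f} (vs 16 for BF16)")
print(f"compressed size relative to BF16: {ratio:.0%}")
```

Because the weights cluster in a narrow range of magnitudes, only a small fraction of the 256 possible exponent values ever appear, so the Huffman code needs far fewer than 8 bits on average. The sketch ignores everything that makes this practical on a GPU, such as packing the bit string into bytes and decoding it in parallel.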
