July 14, 2025 30 mins

This episode features a conversation with Sid Sheth, CEO and Co-Founder of d-Matrix. With a deep background in building advanced systems for high-performance workloads, Sid and his team are at the forefront of AI compute innovation, focused specifically on making AI inference more efficient, cost-effective, and scalable for enterprise use. Host Seth Earley dives into Sid’s journey, the architectural shifts in AI infrastructure, and what they mean for organizations seeking to maximize their AI investments.

Key Takeaways:

  • The Evolution of AI Infrastructure: Sid breaks down how the traditional tech stack is being rebuilt to support the unique demands of AI, particularly shifting from general-purpose CPUs to specialized accelerators for inference.
  • Training vs. Inference: Using a human analogy, Sid explains the fundamental difference between model training (learning) and inference (applying knowledge), emphasizing why most enterprise value comes from efficient inference.
  • Purpose-built Accelerators: d-Matrix’s approach to creating inference-only accelerators means dramatically reducing overhead, latency, energy consumption, and cost compared to traditional GPU solutions.
  • Scalability & Efficiency: Learn how in-memory compute, chiplets, and innovative memory architectures enable d-Matrix to deliver up to 10x lower latency and significant gains in energy and dollar efficiency for AI applications.
  • Market Trends: Sid reveals how, although today’s focus is largely on training compute, the next five to ten years will see inference dominate as organizations seek ROI from deployed AI.
  • Enterprise Strategy Advice: Sid urges tech leaders not to be conservative, but to embrace a heterogeneous and flexible infrastructure strategy to future-proof their AI investments.
  • Real-World Use Cases: Hear about d-Matrix’s work enabling low-latency agentic/reasoning models, which are critical for real-time and interactive AI workloads.

Insightful Quote from Sid Sheth:

“Now is not the time to be conservative and get comfortable with choice. In the world of inference there isn’t going to be one size fits all... The world of the future is heterogeneous, where you’re going to have a compute fleet that is augmented with different types of compute to serve different needs.”

Tune in to discover how to rethink your AI infrastructure strategy and stay ahead in the rapidly evolving world of enterprise AI!
