Streaming Audio: Apache Kafka® & Real-Time Data

Streaming Audio: Apache Kafka® & Real-Time Data

Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems, to data mesh, data integration, modern data architectures, and data mesh built with Confluent and cloud Kafka as a service. Join host Kris Jenkins (Senior Developer Advocate, Confluent) as he streams through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®️, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

Episodes

November 29, 2022 29 min

Can we use machine learning to detect security threats in real-time? As organizations increasingly rely on distributed systems, it is becoming more important to analyze the traffic that passes through those systems quickly. Confluent Hackathon ’22 finalist, Géraud Dugé de Bernonville (Data Consultant, Zenika Bordeaux), shares how his team used TensorFlow (machine learning) and Neo4j (graph database) to analyze and detect network tr...

Mark as Played

What happens when you need to store more than a few petabytes of data? Rittika Adhikari (Software Engineer, Confluent) discusses how her team implemented tiered storage, a method for improving the scalability and elasticity of data storage in Apache Kafka®. She also explores the motivating factors for building it in the first place: cost, performance, and manageability. 

Before Tiered Storage, there was no real way to retain Kafka d...

Mark as Played
November 15, 2022 38 min

In principle, data mesh architecture should liberate teams to build their systems and gather data in a distributed way, without having to explicitly coordinate. Data is the thing that can and should decouple teams, but proper implementation has its challenges.

In this episode, Kris talks to Florian Albrecht (Solution Architect, Hermes Germany) about Galapagos, an open-source DevOps software tool for Apache Kafka® that Albrecht creat...

Mark as Played

Is real-time data streaming the future, or will batch processing always be with us? Interest in streaming data architecture is booming, but just as many teams are still happily batching away. Batch processing is still simpler to implement than stream processing, and successfully moving from batch to streaming requires a significant change to a team’s habits and processes, as well as a meaningful upfront investment. Some are even ru...

Mark as Played

Streaming real-time data at scale and processing it efficiently is critical to cybersecurity organizations like SecurityScorecard. Jared Smith, Senior Director of Threat Intelligence, and Brandon Brown, Senior Staff Software Engineer, Data Platform at SecurityScorecard, discuss their journey from using RabbitMQ to open-source Apache Kafka® for stream processing. As well as why turning to fully-managed Kafka on Confluent Cloud is th...

Mark as Played
October 26, 2022 58 min

What are some recommendations to consider when running Apache Kafka® in production? Jun Rao, one of the original Kafka creators, as well as an ongoing committer and PMC member, shares the essential wisdom he's gained from developing Kafka and dealing with a large number of Kafka use cases.

Here are 6 recommendations for maximizing Kafka in production:

1. Nail Down the Operational Part
When setting up your cluster, in addition to d...

Mark as Played

Is it possible to build a real-time data platform without using stateful stream processing? Forecasty.ai is an artificial intelligence platform for forecasting commodity prices, imparting insights into the future valuations of raw materials for users. Nearly all AI models are batch-trained once, but precious commodities are linked to ever-fluctuating global financial markets, which require real-time insights. In this episode, Ralph...

Mark as Played
October 12, 2022 71 min

Java Virtual Machines (JVMs) impact Apache Kafka® performance in production. How can you optimize your event-streaming architectures so they process more Kafka messages using the same number of JVMs? Gil Tene (CTO and Co-Founder, Azul) delves into JVM internals and how developers and architects can use Java and optimized JVMs to make real-time data pipelines more performant and more cost effective, with use cases.

Gil has deep roots...

Mark as Played

Apache Kafka® 3.3 is released! With over two years of development, KIP-833 marks KRaft as production ready for new AK 3.3 clusters only. On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent) shares highlights of this release, with KIPs from Kafka Core, Kafka Streams, and Kafka Connect. 

To reduce request overhead and simplify client-side code, KIP-709 extends the OffsetFetch API requests to accept mult...

Mark as Played

How do you set data applications in motion by running stateful business logic on streaming data? Capturing key stream processing events and cumulative statistics that necessitate real-time data assessment, migration, and visualization remains as a gap—for event-driven systems and stream processing frameworks according to Fred Patton (Developer Evangelist, Swim Inc.) In this episode, Fred explains streaming applications and how it c...

Mark as Played

What’s your favorite podcast? Would you like to find some new ones? In celebration of International Podcast Day, Kris Jenkins invites 12 experts from the Apache Kafka® community to talk about their favorite podcasts. Unlike other episodes where guests educate developers and tell stories about Kafka, its surrounding technological ecosystem, or the Cloud, this special episode provides a glimpse into what these guests have learned thr...

Mark as Played

How do you build an event-driven application that can react to real-time data streams as they happen? Kris Jenkins (Senior Developer Advocate, Confluent) will be hosting another fun, hands-on programming workshop—Coding in Motion: Watching the River Flow, to demonstrate how you can build a reactive event streaming application with Apache Kafka®, ksqlDB using Python.

As a developer advocate, Kris often speaks at conferences, and the ...

Mark as Played

Processing real-time event streams enables countless use cases big and small. With a day job designing and building highly available distributed data systems, Simon Aubury (Principal Data Engineer, Thoughtworks) believes stream-processing thinking can be applied to any stream of events. 

In this episode, Simon shares his Confluent Hackathon ’22 winning project—a wildlife monitoring system to observe population trends over time using...

Mark as Played

How do you analyze Reddit sentiment with Apache Kafka® and microservices? Bringing the fresh perspective of someone who is both new to Kafka and the industry, Shufan Liu, nascent Developer Advocate at Confluent, discusses projects he has worked on during his summer internship—a Cluster Linking extension to a conceptual data pipeline project, and a microservice-based Reddit sentiment-analysis project. Shufan demonstrates that it’s p...

Mark as Played

How do you plan Apache Kafka® capacity and Kafka Streams sizing for optimal performance? 

When Jason Bell (Principal Engineer, Dataworks and founder of Synthetica Data), begins to plan a Kafka cluster, he starts with a deep inspection of the customer's data itself—determining its volume as well as its contents: Is it JSON, straight pieces of text, or images? He then determines if Kafka is a good fit for the project overall, a de...

Mark as Played

Reimagining a data architecture to provide real-time data flow for sporting events can be complicated, especially for organizations with as much data as World Table Tennis (WTT). Vatsan Rama (Director of IT, ITTF Group) shares why real-time data is essential in the sporting world and how his team reengineered their data system in 18 months, moving from a solely on-premises infrastructure to a cloud-native data system that uses Conf...

Mark as Played

Inheriting software in the banking sector can be challenging. Perhaps the only thing harder is inheriting software built by a committee of banks. How do you keep it running, while improving it, refactoring it, and planning a bigger future for it? In this episode, Jean-Francois Garet (Technical Architect, Symphony) shares his experience at Symphony as he helps it evolve from an inherited, monolithic, single-tenant architecture to an...

Mark as Played
August 10, 2022 39 min

Security is a primary consideration for any system design, and Apache Kafka® is no exception. Out of the box, Kafka has relatively little security enabled. Rajini Sivaram (Principal Engineer, Confluent, and co-author of “Kafka: The Definitive Guide” ) discusses how Kafka has gone from a system that included no security to providing an extensible and flexible platform for any business to build a secure messaging system. She shares c...

Mark as Played

Java Database Connectivity (JDBC) is the Java API used to connect to a database. As one of the most popular Kafka connectors, it's important to prevent issues with your integrations.

In this episode, we'll cover how a JDBC connection works, and common issues with your database connection.

Why the Kafka JDBC Connector?

 When it comes to streaming database events into Apache Kafka®, the JDBC connector usually represents the fi...

Mark as Played

Setting up a reliable cloud networking for your Apache Kafka® infrastructure can be complex. There are many factors to consider—cost, security, scalability, and availability. With immense experience building cloud-native Kafka solutions on Confluent Cloud, Justin Lee (Principal Solutions Engineer, Enterprise Solutions Engineering, Confluent) and Dennis Wittekind (Customer Success Technical Architect, Customer Success Engineering, C...

Mark as Played

Popular Podcasts

    Current and classic episodes, featuring compelling true-crime mysteries, powerful documentaries and in-depth investigations.

    Crime Junkie

    If you can never get enough true crime... Congratulations, you’ve found your people.

    The Piketon Massacre

    The most notorious mass murder in Ohio’s history happened on the night of April 21, 2016 in rural Pike County. Four crime scenes, thirty-two gunshot wounds, eight members of the Rhoden family left dead in their homes. Two years later a local family of four, the Wagners, are arrested and charged with the crimes. As the Wagners await four back-to-back capital murder trials, the KT Studios team revisits Pike County to examine: crime-scene forensics, upcoming legal proceedings, and the ties that bind the victims and the accused. As events unfold and new crimes are uncovered, what will it mean for all involved? What will it mean for Pike County?

    Stuff You Should Know

    If you've ever wanted to know about champagne, satanism, the Stonewall Uprising, chaos theory, LSD, El Nino, true crime and Rosa Parks then look no further. Josh and Chuck have you covered.

    Morbid

    It’s a lighthearted nightmare in here, weirdos! Morbid is a true crime, creepy history and all things spooky podcast hosted by an autopsy technician and a hairstylist. Join us for a heavy dose of research with a dash of comedy thrown in for flavor.

Advertise With Us

For You

    Music, radio and podcasts, all free. Listen online or download the iHeart App.

    Connect

    © 2022 iHeartMedia, Inc.