Tag: kafka

Poison Pills in Kafka (I)

What is a poison pill? A “poison pill” is a record that always fails when consumed, no matter how many times it is retried. Poison pills come in different forms: corrupted records, or records that make your consumer’s deserializer fail (e.g., an Avro record whose writer schema is not compatible with the consumer’s reader schema). The problem with a poison pill is that, unless the consumer eventually handles it, it blocks consumption of the topic partition that contains it, halting the consumer’s progress. What can we do with poison pills? There are many different strategies to deal with […]
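One mitigation the excerpt hints at is to stop the deserializer from throwing in the first place. Below is a minimal sketch (the wrapper class and its dead-letter handling are my own illustration, not part of the Kafka API) that delegates to a real deserializer and returns null on failure, so the consumer can detect and skip the poison pill instead of failing on every poll:

```java
import java.util.Map;
import org.apache.kafka.common.serialization.Deserializer;

// Hypothetical "skip on failure" wrapper: delegates to a real deserializer
// and returns null instead of throwing, so a poison pill cannot block the partition.
public class SafeDeserializer<T> implements Deserializer<T> {
    private final Deserializer<T> delegate;

    public SafeDeserializer(Deserializer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        delegate.configure(configs, isKey);
    }

    @Override
    public T deserialize(String topic, byte[] data) {
        try {
            return delegate.deserialize(topic, data);
        } catch (Exception e) {
            // In a real system you would log the raw record and/or send it to a
            // dead-letter topic here instead of silently dropping it.
            return null;
        }
    }

    @Override
    public void close() {
        delegate.close();
    }
}
```

The consuming loop then treats a null value as a signal to log, route to a dead-letter topic, or skip; the offset is committed as usual, so the partition keeps moving.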

By Javier

Kafka defaults that you should re-consider (I)

Image taken from http://www.htbdpodcast.com

There is a vast number of configuration options for Apache Kafka, mostly because the product can be fine-tuned to perform in very different scenarios (e.g., low latency, high throughput, durability). These defaults span brokers, producers, and consumers (plus sidecar products like Connect or Streams). The Kafka maintainers do their best to provide a comprehensive set of defaults that will just work, but some of them can be relatively dangerous if used blindly, as they might have unexpected side effects or be optimized for a use case different from yours. […]
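The specific settings the post covers are elided above, but as an illustration of the kind of override it argues for, here is a sketch that pins a few durability-sensitive values explicitly rather than trusting the defaults (the broker address and the particular choices are assumptions for the example):

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;

public class ExplicitDefaults {
    public static void main(String[] args) {
        Properties producer = new Properties();
        producer.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // hypothetical broker
        // The default acks=1 only waits for the partition leader; "all" waits
        // for the in-sync replicas, trading latency for durability.
        producer.put(ProducerConfig.ACKS_CONFIG, "all");

        Properties consumer = new Properties();
        consumer.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
        // The default auto.offset.reset=latest can silently skip data for a new group.
        consumer.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        // The default enable.auto.commit=true can commit offsets for records
        // your application never finished processing.
        consumer.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        System.out.println("producer=" + producer + ", consumer=" + consumer);
    }
}
```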

By Javier

Timeouts in Kafka clients and Kafka Streams

IMPORTANT: This information is based on Kafka and Kafka Streams 1.0.0. Past or future versions may differ. As with any distributed system, Kafka relies on timeouts to detect failures: clients and brokers use them to detect each other’s unavailability. What follows is a description of the configuration values that control the timeouts both brokers and clients use to detect unavailable clients. The original design for the poll() method in the Java consumer tried to kill two birds with one stone: guarantee consumer liveness, and guarantee progress […]
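For context on those two birds: by 1.0.0, KIP-62 had split the poll() contract into a background heartbeat for liveness and a separate progress deadline, each with its own setting. A minimal sketch of the relevant consumer configuration follows; the values shown are the 1.0.0 defaults, used here for illustration rather than as recommendations:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class TimeoutConfigs {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // hypothetical broker
        // Liveness: the broker evicts the consumer if no heartbeat arrives within
        // session.timeout.ms; heartbeats are sent from a background thread.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "10000");
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");
        // Progress: the consumer leaves the group if the application takes longer
        // than this between successive calls to poll().
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000");
        System.out.println(props);
    }
}
```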

Incompatible AVRO schema in Schema Registry

My company uses Apache Kafka as the spine for its next-generation architecture. Kafka is a distributed append-only log that can be used as a pub-sub mechanism. We use Kafka to publish events once business processes have completed successfully, allowing a high degree of decoupling between producers and consumers. These events are encoded using Avro schemas. Avro is a binary serialization format that represents data much more compactly than, for instance, JSON. Given the high volume of events we publish to Kafka, using a compact format is critical. In combination with Avro we […]
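To make the compatibility question concrete, here is a minimal sketch using Avro’s own SchemaCompatibility checker (the Event schema is made up, and this is not necessarily the registry workflow the post goes on to describe). Adding a field without a default breaks backward compatibility, because a reader on the new schema cannot fill in the missing field when decoding old records:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaPairCompatibility;

public class CompatCheck {
    public static void main(String[] args) {
        // Writer schema: the version old producers used.
        Schema writer = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"string\"}]}");
        // Reader schema: adds a field WITHOUT a default value.
        Schema reader = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"string\"},"
            + "{\"name\":\"source\",\"type\":\"string\"}]}");

        SchemaPairCompatibility result =
            SchemaCompatibility.checkReaderWriterCompatibility(reader, writer);
        System.out.println(result.getType()); // INCOMPATIBLE
    }
}
```

Declaring the new field with a default value would make the pair compatible again, which is why a registry running in backward-compatibility mode rejects the first variant but accepts the second.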

By El Javi
