Kafka for Rubyists - Part 2
This is part 2 of my post describing Kafka and how you can use it in your Ruby projects. Check out part 1 here.
In part 1 we went over the basics of Kafka and how it operates, as well as its basic architecture. In this post I am going to cover why you may want to use Kafka for your next project by describing some of its benefits, such as durability and fault tolerance. Finally, I want to wrap up with some popular Ruby Kafka clients and why you would choose Kafka over other popular message brokers.
Before I get into some of the main benefits of using Kafka for a distributed messaging system, I want to wrap up the final main component of Kafka’s design. In part 1 I described the majority of Kafka components/terminology including producers, consumers, topics and partitions. The final high-level abstraction when describing Kafka’s architecture is the consumer group.
Normally, messaging systems operate in one of two models: queuing and pub-sub. In a queuing system, messages are placed in a queue and one consumer out of a group consumes a single message from the queue. Pub-sub systems operate by broadcasting each message to all consumers and a consumer can choose whether or not to actually consume the message.
Kafka allows for both use cases by abstracting consumers into what is known as a consumer group. When a consumer is created it must label itself as part of a consumer group. Each message that is published to a topic is delivered to one consumer instance in each consumer group that is subscribed. This allows for the developer to choose if their cluster will act as a traditional messaging queue or as a pub-sub system by deciding how to allocate consumers in consumer groups. If all consumers are placed into a single subscribing consumer group, then a message will only go to a single consumer, just as in a queuing system.
Topics 0 and 1 each publish messages to a single Consumer Group (Queue)
Alternatively, if consumers are placed in several subscribing consumer groups, then a message will go to one consumer in each group, much like traditional pub-sub.
Topic 1 publishes messages to multiple Consumer Groups (Pub-Sub)
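To make the two layouts concrete, here is a small toy model in plain Ruby. It is only a sketch of the delivery rule (one consumer per subscribing group receives each message), not how Kafka is implemented. The group and consumer names are made up, and the round-robin pick is an assumption standing in for Kafka's partition-based assignment.

```ruby
# Toy model: each published message goes to exactly one consumer per group.
# Group names and consumer names are hypothetical.
def deliver(message_index, groups)
  # Round-robin stands in for Kafka's real partition-based assignment.
  groups.transform_values do |consumers|
    consumers[message_index % consumers.size]
  end
end

# One group holding three consumers: behaves like a queue.
QUEUE_STYLE = { "workers" => %w[w1 w2 w3] }.freeze

# Three groups holding one consumer each: behaves like pub-sub.
PUBSUB_STYLE = { "billing" => ["b1"], "search" => ["s1"], "audit" => ["a1"] }.freeze

3.times do |i|
  puts "message #{i}: queue=#{deliver(i, QUEUE_STYLE)} pubsub=#{deliver(i, PUBSUB_STYLE)}"
end
```

In the queue layout each message lands on a different worker, while in the pub-sub layout every group sees every message.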
Kafka can also guarantee the order of delivered messages and support load balancing across the cluster by leveraging the fact that topics are broken up into multiple partitions (described in part 1). Each partition is consumed by exactly one consumer in each consumer group. This ensures that the consumer is the only reader of that partition within the group, which allows for in-order delivery of the messages in that partition. Note that ordering is guaranteed within a partition, not across all partitions of a topic.
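The "one consumer per partition within a group" rule can be sketched in a few lines of Ruby. The round-robin strategy and the consumer names here are assumptions for illustration; Kafka's actual assignment logic may differ.

```ruby
# Assign every partition to exactly one consumer in a group.
# Round-robin is an illustrative choice, not necessarily what Kafka does.
def assign_partitions(partitions, consumers)
  assignment = Hash.new { |hash, consumer| hash[consumer] = [] }
  partitions.each_with_index do |partition, index|
    assignment[consumers[index % consumers.size]] << partition
  end
  assignment
end

p assign_partitions([0, 1, 2, 3], %w[c1 c2])
# c1 owns partitions 0 and 2, c2 owns partitions 1 and 3
```

Because no partition is shared inside a group, each consumer sees its partitions' messages in the order they were written. A side effect is that a group with more consumers than partitions leaves the extra consumers idle.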
Durability and Fault Tolerance
Outside of in-order message guarantees, Kafka also provides two highly sought after features of distributed design: Durability and Fault Tolerance.
Durability simply means that once a message is published, it cannot be lost. Messaging systems offer several delivery guarantees, including:
- At most once - A message may be lost but is never re-delivered
- At least once - A message can never be lost, but may be delivered multiple times
- Exactly once - A message is delivered once and only once
It might not be obvious, but most systems that require distributed messaging would be easier to develop and reason about if message delivery were guaranteed exactly once. In this case there is no possibility of duplicate messages and messages are guaranteed to never be lost.
We must also realize that durability guarantees exist on both the producer and consumer sides of a distributed system. For a message to be considered published in Kafka, it must be committed. A committed message is guaranteed not to be lost as long as at least one broker holding an in-sync replica of its partition is ‘alive’. The Kafka docs describe in detail what ‘alive’ actually means. Messages are not handed out to consumers until they have been marked as committed, which means that consumers do not have to worry about a message being lost once it is consumed.
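On the consumer side, the difference between the first two guarantees often comes down to when a consumer records its progress relative to doing the work. The following is a toy simulation, not Kafka client code: the `consume` method, its parameters and the simulated crash are all hypothetical.

```ruby
# Toy consumer that "crashes" once while handling the message at crash_at.
# Committing the offset before processing gives at-most-once delivery;
# committing after processing gives at-least-once delivery.
def consume(log, commit_first:, crash_at:)
  processed = []
  committed = 0     # next offset to read after a restart
  crashed = false

  loop do
    offset = committed
    break if offset >= log.size

    begin
      if commit_first
        committed = offset + 1
        raise "crash" if offset == crash_at && !crashed
        processed << log[offset]
      else
        processed << log[offset]
        raise "crash" if offset == crash_at && !crashed
        committed = offset + 1
      end
    rescue RuntimeError
      crashed = true  # "restart" and resume from the last committed offset
    end
  end

  processed
end

p consume(%w[a b c], commit_first: true, crash_at: 1)   # => ["a", "c"]
p consume(%w[a b c], commit_first: false, crash_at: 1)  # => ["a", "b", "b", "c"]
```

In the first run message "b" is lost but nothing is repeated; in the second run nothing is lost but "b" is handled twice.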
Kafka was designed with replication in mind. This means that Kafka allows for replicating (or copying) each topic’s partitions across a number of servers. This feature allows Kafka to provide configurable message durability for producers as producers can decide whether or not they want to wait for a message to be replicated to each node before being marked as committed.
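As a rough illustration of that producer-side choice, the sketch below maps an acknowledgement level to the number of replicas a producer waits for. The method name is hypothetical, and the 0 / 1 / -1 levels are my assumption based on how the producer's required-acks setting has commonly been described, so check the docs for the version you run.

```ruby
# How many replica acknowledgements a producer waits for before it treats
# a message as sent. Levels (assumed): 0 = do not wait, 1 = leader only,
# -1 = every in-sync replica.
def acks_to_wait_for(required_acks, in_sync_replicas)
  case required_acks
  when 0 then 0
  when 1 then 1
  when -1 then in_sync_replicas
  else raise ArgumentError, "unsupported acks level: #{required_acks}"
  end
end

[0, 1, -1].each do |level|
  puts "acks=#{level}: wait for #{acks_to_wait_for(level, 3)} of 3 replicas"
end
```

Lower levels trade durability for latency: with 0 the producer never learns whether the write survived, while -1 waits for the full in-sync set.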
Fault tolerance of messaging systems is another important feature to consider when choosing one. Fault tolerance is the ability of a system to continue to function in the event of a failure of one or more of its components. In distributed environments servers die, hard drives fail and networks go down. As the complexity and number of components of your system increases, so does the probability of failures.
Just as with message durability, Kafka was designed with fault tolerance in mind from the beginning. Unlike many messaging systems, which buffer messages in application memory before flushing to disk, Kafka writes each message to its on-disk log before it is considered published (strictly speaking, the write lands in the operating system’s page cache first and the OS decides when to flush it, with replication covering that gap). This is possible while still maintaining performance because messages are written sequentially rather than with random access. This means that if a Kafka node crashes or suddenly becomes unavailable, the last message published before the crash will still be available on disk when the node comes back up. This is also made possible by the fact that Kafka consumers are responsible for keeping track of the last message offset that they consumed. When a crashed node containing a partition comes back online, consumers are able to pick up right where they left off, as long as the message was successfully written to disk before the node crashed.
As previously mentioned, Kafka partitions are replicated throughout the cluster so that each server handles data for a subset of partitions. Kafka marks one server that contains a partition as the ‘leader’ and the other servers that contain the replica partitions as ‘followers’. The leader is responsible for all write and read requests for that partition and the followers are responsible for replicating the leader. If a leader node goes down, a follower node is elected to become the new leader.
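The offset idea is easy to model in plain Ruby. This is only a sketch of the concept with hypothetical class names (`ToyLog`, `ToyConsumer`); a real Kafka log lives on disk and is split into segments.

```ruby
# Toy append-only log: messages are only ever added to the end, and each
# one is addressed by its offset (its position in the log).
class ToyLog
  def initialize
    @messages = []
  end

  # Appends a message and returns the offset it was stored at.
  def append(message)
    @messages << message
    @messages.size - 1
  end

  # Returns every message from the given offset onward.
  def read_from(offset)
    @messages[offset..-1] || []
  end
end

# The consumer, not the log, remembers how far it has read.
class ToyConsumer
  attr_reader :next_offset

  def initialize(log, next_offset = 0)
    @log = log
    @next_offset = next_offset
  end

  def poll
    messages = @log.read_from(@next_offset)
    @next_offset += messages.size
    messages
  end
end
```

A consumer that saved its `next_offset` somewhere safe can be rebuilt after a restart with `ToyConsumer.new(log, saved_offset)` and will only see messages it has not read yet.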
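A stripped-down picture of failover for a single partition might look like the following. The `ToyPartition` class and the broker ids are hypothetical, and promoting the first follower is a deliberate simplification of Kafka's real election process.

```ruby
# Toy model of one partition's replicas, identified by broker id.
class ToyPartition
  attr_reader :leader, :followers

  def initialize(leader, followers)
    @leader = leader
    @followers = followers.dup
  end

  # When the leader fails, promote a follower. Picking the first one is a
  # simplification; Kafka's actual election is more involved.
  def fail_leader!
    raise "no replica left to promote" if @followers.empty?

    @leader = @followers.shift
  end
end

partition = ToyPartition.new(1, [2, 3])
partition.fail_leader!
puts "new leader: #{partition.leader}, followers: #{partition.followers}"
```

Clients keep reading and writing through whichever broker is currently the leader, which is why the partition stays available as long as a replica survives.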
I am glossing over much of how Kafka handles replication, leader election and other important details for the sake of brevity. The Kafka docs are great and cover all of this in much greater detail than I could possibly cover here, so please check them out.
Now that we have covered Kafka in some detail, I’d like to go over the main Kafka clients that allow for easy integration between your Ruby code and a Kafka broker.
Poseidon is probably the most popular Kafka Ruby client and implements many of the concepts that we have previously discussed such as producing and consuming messages to and from a Kafka topic. Poseidon also supports message compression by leveraging the snappy gem if installed.
While the Poseidon gem is very popular, one of the main things that it is lacking is support for consumer groups. Depending on your use case, this may be a deal breaker. If so, Poseidon Cluster is here to help. Poseidon Cluster adds support for consumer groups by extending the Poseidon gem itself.
There is also the kafka-rb gem; however, according to RubyGems.org it is not nearly as popular as Poseidon (77k downloads vs 475k).
Communicating with Kafka directly via Ruby is not your only option when integrating Kafka into your system. The Confluent Platform is a redistribution of Apache Kafka, along with some new technologies built on top. One of these technologies is the Kafka REST Proxy which provides a RESTful interface on top of Kafka. This means that you can produce, consume and interact with Kafka directly via HTTP using any Ruby HTTP client such as rest-client, faraday, HTTParty or even Net::HTTP in the Ruby standard library.
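As a sketch of what that could look like with nothing but the standard library, the snippet below builds a produce request for the REST Proxy. The base URL, port 8082, the `events` topic and the `build_produce_request` helper are assumptions for illustration, and the content type shown is the v2 JSON format; older proxy versions use a different one, so check the docs for the version you run.

```ruby
require "net/http"
require "json"
require "uri"

# Builds (but does not send) a request that publishes JSON values to a topic.
# Assumes the proxy accepts POST /topics/<topic> with a "records" array.
def build_produce_request(base_url, topic, values)
  uri = URI.join(base_url, "/topics/#{topic}")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/vnd.kafka.json.v2+json"
  request.body = JSON.generate(records: values.map { |value| { value: value } })
  [uri, request]
end

uri, request = build_produce_request("http://localhost:8082", "events", [{ "user" => "alice" }])
puts "POST #{uri} #{request.body}"

# To send it against a running proxy:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.code
```

Keeping the request construction separate from the network call makes it easy to swap in rest-client, faraday or HTTParty later.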
Ok, so hopefully I have convinced you to give Kafka a shot when you need to use a message broker in your Ruby services. As mentioned in part 1, Kafka can be used as much more than just a messaging system. But what if all you want to do is connect separate services in your system? Why would you choose Kafka? There are many other popular messaging systems such as RabbitMQ, ZeroMQ and NSQ, just to name a few. So what makes Kafka different?
Well, there are actually multiple resources that describe the differences between these technologies, so I won’t reproduce that information here. But I will say that if you want a flexible, scalable, performant, and battle-tested technology to serve as the backbone of your messaging infrastructure, Kafka deserves a hard look.
As always, please email, tweet, or comment below to let me know any questions or comments you may have.