JMS vs Kafka
What is Java Message Service (JMS)?
JMS is a Java API specification for sending messages between applications. It allows systems to send and receive messages without knowing about each other, which enables reliable, asynchronous communication between applications.
What is a message?
A message is information. This information can be represented in various formats. Text, XML, JSON, and even a Java POJO are all valid formats for a message.
How does JMS work?
The overarching concept of JMS is simple. Application A sends messages to a destination queue. Application B reads messages from that queue.
A doesn't need to know about B. Likewise, if A is down, B can still read from the shared destination. This creates a loosely coupled relationship between A and B.
This is more resilient than direct peer-to-peer communication (raw TCP sockets, CORBA, RMI) because A only communicates with B indirectly. If one system goes down, the other can continue reading from or writing to the queue.
Furthermore, because JMS is a specification, code written against its interfaces can move between implementations with little change, which improves portability.
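As a rough illustration of this decoupling, here is a minimal in-memory sketch. It does not use a real JMS provider: `Destination`, `application_a`, and `application_b` are hypothetical names invented for this example, and the queue lives in process memory rather than in a broker.

```python
from collections import deque


class Destination:
    """Hypothetical in-memory stand-in for a JMS queue (not a real broker)."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        # Return None when the queue is empty instead of blocking.
        return self._messages.popleft() if self._messages else None


def application_a(destination):
    """Producer: only knows about the destination, not about B."""
    for i in range(3):
        destination.send(f"order-{i}")


def application_b(destination):
    """Consumer: only knows about the destination, not about A."""
    received = []
    while True:
        message = destination.receive()
        if message is None:
            break
        received.append(message)
    return received


orders = Destination()
application_a(orders)            # A sends its messages and then "goes down"
result = application_b(orders)   # B starts later and still gets everything
print(result)
```

The two functions never reference each other, so either one could be stopped, restarted, or replaced without the other noticing.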
It’s just a specification…
JMS specifies the interfaces for communicating between systems. It only specifies how components should interact. Similar to JPA vs Hibernate, it’s up to providers to actually implement the interfaces defined in JMS.
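The specification/implementation split can be sketched in a few lines. This is only an analogy written in Python: `MessageSender`, `InMemoryProvider`, and `LoggingProvider` are made-up names for illustration and are not part of JMS or of any real provider.

```python
from abc import ABC, abstractmethod


class MessageSender(ABC):
    """Hypothetical 'specification': declares behaviour, implements nothing."""

    @abstractmethod
    def send(self, destination: str, body: str) -> None:
        ...


class InMemoryProvider(MessageSender):
    """One hypothetical 'vendor' implementation."""

    def __init__(self):
        self.sent = []

    def send(self, destination, body):
        self.sent.append((destination, body))


class LoggingProvider(MessageSender):
    """A second hypothetical implementation with different internals."""

    def __init__(self):
        self.lines = []

    def send(self, destination, body):
        self.lines.append(f"{destination} <- {body}")


def place_order(sender: MessageSender, order_id: str) -> None:
    # Application code depends only on the interface, never on a provider.
    sender.send("orders", order_id)


provider_a = InMemoryProvider()
provider_b = LoggingProvider()
place_order(provider_a, "order-1")
place_order(provider_b, "order-1")
```

`place_order` works unchanged with either provider, which is the same property that lets application code written against the JMS interfaces run on different brokers.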
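Some of the more popular products that can be used through the JMS API include:
- SQS (Amazon), through the Amazon SQS Java Messaging Library
- ActiveMQ (Apache)
- WebLogic Messaging (Oracle)
- WebSphere MQ (IBM), now called IBM MQ
- RabbitMQ, through the RabbitMQ JMS client
Notice how different vendors have their own "flavor" of JMS. While there are differences between these options, they all expose the same JMS interfaces to application code. Not all of them are native JMS brokers, though: SQS and RabbitMQ speak their own protocols and offer JMS through client libraries, so their coverage of the specification may be partial. Providers that ship as part of an application server, such as WebLogic, are the ones usually described as "Jakarta EE compliant".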
What's Jakarta anyways?
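Jakarta EE is the successor to Java EE. The new name is the result of an ownership change in which the Java EE technologies were moved from Oracle to the Eclipse Foundation.
As a result, Jakarta EE now refers to the same set of technologies. JMS itself was renamed Jakarta Messaging, and from Jakarta EE 9 onwards its packages moved from javax.jms to jakarta.jms.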
JMS supports two communication models…
Point to Point Communication
Producer applications send messages to a message queue. Consumer applications listen for messages on that queue. Each message is consumed by only one consumer. A queue keeps messages until they are consumed or they expire.
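The "only one consumer per message" rule can be sketched as follows. This is a simplified model, not broker code: `drain_point_to_point` is a hypothetical helper, and the strict round-robin hand-out is an assumption made to keep the example deterministic. Real brokers decide which consumer gets the next message in their own way.

```python
from collections import deque


def drain_point_to_point(messages, consumer_names):
    """Hand each queued message to exactly one consumer.

    Assumption: consumers take turns in round-robin order. A real broker
    makes no such ordering promise between competing consumers.
    """
    queue = deque(messages)
    delivered = {name: [] for name in consumer_names}
    turn = 0
    while queue:
        name = consumer_names[turn % len(consumer_names)]
        delivered[name].append(queue.popleft())  # removed from the queue once taken
        turn += 1
    return delivered


delivered = drain_point_to_point(
    ["m1", "m2", "m3", "m4", "m5"],
    ["consumer-1", "consumer-2"],
)
print(delivered)
```

No message appears in both lists: adding consumers to a queue splits the work instead of duplicating it.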
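Publish/Subscribe Communication
Producer applications publish messages to a given topic. Consumer applications subscribe to a given topic. Any number of consumers can subscribe to a given topic, and each subscriber receives its own copy of every message published to it.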
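The topic model can be sketched in the same style. `Topic` here is a hypothetical in-memory class, and it models only non-durable subscribers, which see messages published after they subscribe.

```python
class Topic:
    """Hypothetical in-memory topic with non-durable subscribers."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        # Each subscriber gets its own inbox (a plain list in this sketch).
        inbox = []
        self._subscribers[name] = inbox
        return inbox

    def publish(self, body):
        # Every current subscriber receives its own copy of the message.
        for inbox in self._subscribers.values():
            inbox.append(body)


prices = Topic()
billing = prices.subscribe("billing")
prices.publish("price-1")
audit = prices.subscribe("audit")   # subscribes after price-1 was published
prices.publish("price-2")

print(billing)
print(audit)
```

The late subscriber misses the earlier message, and nothing is kept once every current subscriber has its copy.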
Hmmmm that sounds kind of like Kafka…
What is Kafka?
Kafka is a distributed streaming platform. Using Kafka, applications can publish and subscribe to messages similar to JMS.
Kafka allows for both the reliable transfer and transformation of data. Applications use Kafka to process streams of data in a scalable, fault tolerant manner.
How does Kafka work?
Applications called producers publish messages to topics. Consumer applications subscribe to these topics. This creates a pub/sub model similar to JMS.
Kafka topics are partitioned across a cluster of servers. Consumers are divided into consumer groups. Multiple consumer groups can read from the same topic.
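Each consumer within a consumer group is responsible for reading from specific partitions, and each partition is read by only one consumer in the group. Because a partition is an ordered log with a single reader per group, messages within a partition are consumed in the order they were written. There is no ordering guarantee across partitions.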
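Per-partition ordering can be sketched as follows. This is a simplified model rather than Kafka client code: `partition_for` and `publish_all` are hypothetical helpers, and CRC32 is used only as a convenient stable hash (Kafka's default partitioner uses a different hash function for keyed records).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3


def partition_for(key: str) -> int:
    # Assumption: CRC32 stands in for the real partitioner's hash.
    # The point is only that the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def publish_all(events):
    """Append each (key, sequence) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, seq in events:
        partitions[partition_for(key)].append((key, seq))
    return partitions


events = [
    ("user-a", 1), ("user-b", 1), ("user-a", 2),
    ("user-c", 1), ("user-b", 2), ("user-a", 3),
]
partitions = publish_all(events)


def sequences_for(key):
    """What a single reader of the key's partition sees for that key, in order."""
    return [seq for k, seq in partitions[partition_for(key)] if k == key]


print(sequences_for("user-a"))
```

Events for one key always land in one partition, so whichever consumer owns that partition sees them in the order they were published.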
Having groups of consumers collectively consume from topics improves scalability and fault tolerance when consuming data.
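This also decouples the consumption of messages from the production of messages. You can add new consumer groups without touching the producers, and you can add consumers to a group to spread the load. The useful limit within a group is the number of partitions in the topic; consumers beyond that sit idle.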
Want to start using Kafka? Check out this 5 minute Kafka tutorial.
JMS vs Kafka
Unlike JMS, Kafka messages can be consumed multiple times by multiple consumers. While JMS allows multiple consumers to subscribe to the same topic, once a message has been delivered and acknowledged it is gone from the broker. With Kafka, consumers can reread messages because topics retain data as a persisted log of messages (for a configurable amount of time).
A key advantage of Kafka is scalability, especially when applied to the pub/sub model. You can scale processing on a JMS queue by adding more consumers, but those consumers then compete for messages, so each message is handled by only one of them. And while a JMS topic can broadcast the same message to multiple subscribers, every subscriber receives every message, so the workload of a single subscriber is hard to spread across several instances (shared subscriptions in JMS 2.0 ease this somewhat).
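A retained log with independent readers can be sketched like this. `TopicLog` and `GroupReader` are hypothetical classes for illustration: the log is a plain list, time-based retention is not modelled, and offsets are kept in memory rather than committed anywhere.

```python
class TopicLog:
    """Hypothetical append-only log. Retention/expiry is not modelled."""

    def __init__(self):
        self._records = []

    def append(self, value):
        self._records.append(value)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records):
        # Reading never removes anything from the log.
        return self._records[offset:offset + max_records]


class GroupReader:
    """Hypothetical reader that tracks its own position in the log."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self, max_records=10):
        records = self._log.read(self.offset, max_records)
        self.offset += len(records)
        return records

    def seek(self, offset):
        self.offset = offset


log = TopicLog()
for value in ["signup", "login", "purchase"]:
    log.append(value)

analytics = GroupReader(log)
billing = GroupReader(log)

analytics_first = analytics.poll()            # reads all three records
billing_first = billing.poll(max_records=2)   # reads at its own pace
analytics.seek(0)                             # rewind and read again
analytics_replay = analytics.poll()
billing_rest = billing.poll()
```

Reading moves only the reader's offset, so one reader can replay the log while another continues from where it left off.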
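Kafka, by contrast, uses consumer groups to subscribe to a given topic. This allows for easier scaling, as more consumer instances can be added to a group and the topic's partitions are divided among them.
Kafka can typically sustain higher throughput than traditional message brokers. This can be partially attributed to how Kafka reads and writes messages: it appends to a sequential log on disk, relies on the operating system's page cache, and sends and receives records in batches.
But the major performance differences come from how Kafka scales. Unlike the JMS pub/sub model, the broker does not have to maintain a separate queue of pending messages for each subscriber. Instead, each consumer group tracks its own position (offset) in the log, so the burden of keeping track of what has been read is placed on the consumer groups and not on the message brokers themselves.
This makes it possible to add a large number of producers and consumers, as Kafka leverages a distributed system to handle these processes in parallel. The practical ceiling for parallel consumption within one group is the topic's partition count.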
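How adding instances to a group spreads the work can be sketched as follows. `assign_partitions` is a hypothetical helper that uses plain round-robin; Kafka's actual assignment strategies are configurable and more involved, so treat this as a simplified model.

```python
def assign_partitions(consumers, num_partitions):
    """Round-robin partitions over the consumers of one group.

    Assumption: a simplification of what a group rebalance produces.
    Each partition ends up with exactly one owner within the group.
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


PARTITIONS = 4

two_consumers = assign_partitions(["c1", "c2"], PARTITIONS)
four_consumers = assign_partitions(["c1", "c2", "c3", "c4"], PARTITIONS)
five_consumers = assign_partitions(["c1", "c2", "c3", "c4", "c5"], PARTITIONS)

print(two_consumers)
print(four_consumers)
print(five_consumers)
```

Going from two to four consumers halves the partitions each one owns. The fifth consumer gets nothing to read, because a partition has only one owner per group.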
Kafka solves several problems with traditional messaging. It scales the processing of messages in a pub/sub model where multiple consumers can read messages in parallel from a persisted log.
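Kafka can cover most of what JMS is used for, though not always in the same way (it has no built-in message selectors or per-message priorities, for example), while JMS has no equivalent of Kafka's replayable, partitioned log.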
For these reasons, Kafka is becoming more and more popular as an enterprise data streaming platform over more traditional messaging providers.
Be sure to check out "When you should be using Kafka" for a more detailed discussion.