What is a Kafka consumer group ID?

Yuzo_Koyama /

Your thoughts?

EatFreshRupesh /

When people talk about a "Kafka consumer" they can mean different things, and that leads to some confusion.

A Kafka consumer is technically a process that is part of a larger group. This collective group of consumers is called a "consumer group".

A consumer group is collectively responsible for processing the messages from a given topic. Within a group, each partition is assigned to exactly one consumer at a time, and ideally each consumer reads from one partition. Kafka distributes the topic's partitions across the available consumers in the group.

The groupId (the group.id consumer property) associates a Kafka consumer with a consumer group.
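As a rough illustration of that association, here is a minimal sketch of consumer settings written as plain Python dictionaries. The property names (bootstrap.servers, group.id, auto.offset.reset) are standard Kafka consumer settings; the broker address, the group names and the consumer_config helper are made up for this example. Nothing here connects to a broker.

```python
# Shared settings. The broker address is a placeholder, not a real cluster.
base_config = {
    "bootstrap.servers": "localhost:9092",
    "auto.offset.reset": "earliest",
}


def consumer_config(group_id):
    """Hypothetical helper: copy the shared settings and set group.id."""
    return {**base_config, "group.id": group_id}


# Two workers with the same group.id belong to one consumer group.
worker_a = consumer_config("order-processors")
worker_b = consumer_config("order-processors")

# A consumer with a different group.id belongs to a separate group.
auditor = consumer_config("audit-log")

print(worker_a["group.id"] == worker_b["group.id"])
print(worker_a["group.id"] == auditor["group.id"])
```

A dictionary of this shape is the kind of thing you would hand to a Kafka client library when you create the consumer. The only difference between the workers and the auditor is the group.id value.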

If a topic has 3 partitions and you have 2 consumers operating within the same consumer group, one of the consumers will read from 2 partitions and the other will read from 1.

If a topic has 4 partitions and you have 2 consumers operating within the same group, then each consumer will read from 2 partitions.

If a topic has 1 partition and you have 2 consumers in the same group, then one consumer reads from that partition and the other sits idle.
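The three cases above can be sketched with a small simulation. This is a simplified model of spreading one topic's partitions over a group, not Kafka's actual assignment code; the function name assign_partitions and the consumer names are made up for illustration.

```python
def assign_partitions(num_partitions, consumers):
    """Split partitions 0..num_partitions-1 across consumers in one group.

    Simplified model: consumers are sorted by name, each gets a contiguous
    range, and the first few consumers take one extra partition when the
    count does not divide evenly.
    """
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    if not members:
        return assignment

    base, extra = divmod(num_partitions, len(members))
    start = 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


print(assign_partitions(3, ["c1", "c2"]))  # {'c1': [0, 1], 'c2': [2]}
print(assign_partitions(4, ["c1", "c2"]))  # {'c1': [0, 1], 'c2': [2, 3]}
print(assign_partitions(1, ["c1", "c2"]))  # {'c1': [0], 'c2': []}
```

In the last case c2 ends up with an empty list, which is the idle consumer described above.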

theRealJS /

The group.id is how you distinguish different consumer groups. Remember that consumers work together in groups to read data from a particular topic.

Understanding group.id is fundamental to achieving maximum parallelism in Kafka. Remember that the partitions of a given topic are balanced across the available consumers in the group, so adding more consumers than there are partitions does not add parallelism.

blazingSaddles /

The group ID is very important to how different consumers "load balance" partitions. For example, if you have a topic with 10 partitions then two consumers with the same groupId will read from 5 partitions each.

If you have two consumers with different group ids, each consumer will read from all 10 partitions, independently of the other.

In this sense, the groupId is how you define a "consumer group" or group of consumers reading from a given topic/partitions.
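To make the same-group versus different-group contrast concrete, here is a small sketch under the same simplifying assumptions as any toy model: partitions are split evenly inside each group, and groups do not affect one another. The function assign_by_group, the consumer names and the group ids are all hypothetical, and this is not how the Kafka broker is implemented.

```python
from collections import defaultdict


def assign_by_group(num_partitions, consumer_groups):
    """consumer_groups maps a consumer name to its group id.

    Each group is handled on its own: the full set of partitions is
    split among that group's members only.
    """
    members_by_group = defaultdict(list)
    for consumer, group_id in consumer_groups.items():
        members_by_group[group_id].append(consumer)

    assignment = {}
    for members in members_by_group.values():
        members.sort()
        base, extra = divmod(num_partitions, len(members))
        start = 0
        for index, member in enumerate(members):
            count = base + (1 if index < extra else 0)
            assignment[member] = list(range(start, start + count))
            start += count
    return assignment


# Same group id: the 10 partitions are shared, 5 each.
same = assign_by_group(10, {"c1": "reports", "c2": "reports"})
print(same)

# Different group ids: each consumer gets all 10 partitions.
different = assign_by_group(10, {"c1": "reports", "c2": "billing"})
print(different)
```

With the same group id the two consumers split the work; with different group ids each one sees every message in the topic.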