Explain Consumer Offsets in Kafka.
Every message produced to a topic is assigned a unique, sequential id within its partition, called the offset.
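To make the idea concrete, here is a minimal sketch of how sequential offsets could be assigned as messages are appended. `PartitionLog` is a hypothetical in-memory stand-in for a single partition, not a Kafka API.

```python
class PartitionLog:
    """Toy in-memory stand-in for one topic partition (not a Kafka API)."""

    def __init__(self):
        self.records = []

    def append(self, value):
        """Store the value and return the offset it was assigned."""
        offset = len(self.records)  # next offset equals the current log length
        self.records.append(value)
        return offset


log = PartitionLog()
assigned = [log.append(v) for v in ["a", "b", "c"]]
print(assigned)  # [0, 1, 2]
```

Each new message simply gets the next number in the sequence, which is why an offset can later be used as a position to read from.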
The consumer has three options for reading messages from the topic:
--from-beginning: reads all the messages from the beginning of the topic.
--latest: reads only the messages that arrive after the consumer has started.
--offset: reads from a particular offset in a particular partition. This option is typically used programmatically.
For example, --offset 6 starts reading the topic at offset 6, the specific value passed by the consumer.
- The first two options, --from-beginning and --latest, can be explored using the console consumer itself.
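The three options differ only in where reading starts. The sketch below models that with a plain Python list standing in for a partition. The function `start_position` and its mode labels are hypothetical names chosen for illustration, and it assumes the log still begins at offset 0 (nothing has been removed by retention).

```python
def start_position(log_end_offset, mode, offset=None):
    """Return the offset of the first record a new consumer would read.

    mode is a hypothetical label mirroring the three options:
    "from-beginning", "latest", or "offset".
    """
    if mode == "from-beginning":
        return 0  # assumes the log still starts at offset 0
    if mode == "latest":
        return log_end_offset  # only records appended from now on
    if mode == "offset":
        if offset is None or not 0 <= offset <= log_end_offset:
            raise ValueError("offset must be within the log")
        return offset
    raise ValueError(f"unknown mode: {mode}")


records = [f"msg-{i}" for i in range(10)]  # offsets 0..9 already in the partition
end = len(records)

print(records[start_position(end, "from-beginning"):])  # all ten messages
print(records[start_position(end, "latest"):])          # empty until new messages arrive
print(records[start_position(end, "offset", 6):])       # msg-6 through msg-9
```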
Let’s dive deeper into consumer offsets.
- We have only one partition with some records in it. Let’s say we have a consumer that is going to read the messages from the beginning.
Every Kafka consumer is required to provide a group id. A consumer generally polls and retrieves multiple records at a time.
→ As it processes each message, it advances its read offset one at a time: 0, 1, 2, 3, and so on.
- Let’s say that for some reason the consumer crashes or is shut down. While the consumer is down, the producer keeps producing messages to the topic.
Question: When the consumer comes back up after some time, how does it know that it needs to resume reading from offset 4?
- Consumer offsets are stored in an internal topic named __consumer_offsets.
- A consumer offset behaves like a bookmark: it lets the consumer resume reading from the point where it left off.
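The bookmark behaviour can be sketched with a toy model in which a dictionary plays the role of the internal offsets topic. Everything here (`committed`, `consume`, the group id `group-1`) is hypothetical and only illustrates the idea. A real consumer usually commits per batch or on a timer rather than after every record.

```python
# Toy model: a dict stands in for the internal offsets topic.
committed = {}  # (group_id, topic, partition) -> next offset to read


def consume(log, group_id, topic="test_topic", partition=0):
    """Read everything from the committed position, committing as we go."""
    key = (group_id, topic, partition)
    position = committed.get(key, 0)  # no bookmark yet: start from the beginning
    batch = log[position:]
    for _record in batch:
        position += 1
        committed[key] = position  # the bookmark: next offset to read
    return batch


log = ["m0", "m1", "m2", "m3"]
first = consume(log, "group-1")   # reads offsets 0..3

# The consumer is down; the producer keeps appending.
log += ["m4", "m5"]

second = consume(log, "group-1")  # resumes at offset 4
print(first)   # ['m0', 'm1', 'm2', 'm3']
print(second)  # ['m4', 'm5']
```

Because the bookmark is keyed by group id, a consumer starting with a different group id would have no stored position and would begin from its configured starting point instead.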
Practical: Do we really have internal topics?
Please make sure that the ZooKeeper and broker instances are running.
→ To check for the internal topic that holds the consumer offsets, we can run a command that lists all the topics available in the cluster.
.\bin\windows\kafka-topics.bat --zookeeper localhost:2181 --list
__consumer_offsets is the topic that was auto-created for us; it takes care of maintaining the consumer offsets.
test_topic is the topic that we created manually.