Commit log & Retention

Explain Commit log & Retention Policy in Kafka.

Commit Log

Producers send messages to a topic while consumers continuously poll for new messages. Each message the broker receives is appended to a log file on the file system.
  • The log file has the extension .log. Each partition has its own log, so a topic with 4 partitions has 4 log files. That is why it is called a partition commit log.
  • A record is committed only after it has been written to the log file. A consumer that is continuously polling for new records can see only the records that have been committed to the file system.
  • As new records are published to the topic, they are appended to the log file and the process continues. That is the commit log.
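The append-only behaviour described above can be sketched in a few lines of Python. This is a toy in-memory model, not Kafka's implementation: the PartitionLog class and its append and read_from methods are hypothetical names invented for illustration, and a real broker writes to .log files on disk.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy stand-in for one partition's append-only .log file (hypothetical)."""
    records: list = field(default_factory=list)

    def append(self, value: bytes) -> int:
        """Append a record at the end of the log and return its offset."""
        self.records.append(value)
        return len(self.records) - 1

    def read_from(self, offset: int) -> list:
        """Return every record already appended at or after `offset`."""
        return self.records[offset:]


# A topic with 4 partitions has 4 independent logs.
topic = {partition: PartitionLog() for partition in range(4)}

first = topic[0].append(b"hello")
second = topic[0].append(b"world")

# A consumer polling partition 0 from offset 0 sees only what has been appended.
visible = topic[0].read_from(0)
print(first, second, visible)
```

Records are only ever added at the end, so an offset, once assigned, always refers to the same record in that partition.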
The directory where Kafka stores these log files is set by the log.dirs property in the server.properties file. That directory contains the following:
  • The test-topic we created has four folders, ending in 0, 1, 2 and 3. Each folder represents one partition: test-topic-0, test-topic-1, test-topic-2 and test-topic-3.
  • You can also see __consumer_offsets-0 to __consumer_offsets-49. This topic is created automatically to maintain the consumer offsets. Its folders run from 0 to 49, which means the __consumer_offsets topic has 50 partitions.
  • Each partition folder contains a .log file. All data is converted to bytes before it is written to these files.
  • Similarly, you can navigate to the test-topic-1, test-topic-2 and test-topic-3 directories. The .log file holds the records themselves, but the partition log carries much more information than that.
  • To view it, run the DumpLogSegments tool:
.\bin\windows\kafka-run-class.bat kafka.tools.DumpLogSegments --deep-iteration --files /c:/kafka/kafka-logs/test-topic-0/00000000000000000000.log
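The folder and file names above follow a pattern: a partition folder is named topic-partition, and a segment file such as 00000000000000000000.log is named after the offset of the first record it holds. A minimal Python sketch of reading those names follows. The function names partitions_per_topic and base_offset are hypothetical helpers, and the directory list is built by hand rather than read from a real Kafka installation.

```python
from collections import defaultdict


def partitions_per_topic(dir_names):
    """Group '<topic>-<partition>' folder names and count partitions per topic."""
    found = defaultdict(set)
    for name in dir_names:
        # Split on the last hyphen, because topic names may contain hyphens.
        topic, _, partition = name.rpartition("-")
        if topic and partition.isdigit():
            found[topic].add(int(partition))
    return {topic: len(parts) for topic, parts in found.items()}


def base_offset(segment_file_name):
    """Return the offset encoded in a segment file name such as '...0000.log'."""
    return int(segment_file_name.split(".")[0])


# Hand-built listing that mirrors the folders described in the text.
dirs = [f"test-topic-{i}" for i in range(4)]
dirs += [f"__consumer_offsets-{i}" for i in range(50)]

print(partitions_per_topic(dirs))
print(base_offset("00000000000000000000.log"))
```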

Retention Policy

  • The retention policy determines how long a message is retained.
  • It is configured using log.retention.hours in the server.properties file.
  • The default retention period that ships with Kafka is 168 hours (7 days). The unit is hours.
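As a quick check of the arithmetic, the default of 168 hours can be converted to days and to milliseconds (the unit used by settings that end in .ms). This is only a unit conversion in Python; the variable names are made up for illustration.

```python
from datetime import timedelta

# Default from the text: log.retention.hours=168
LOG_RETENTION_HOURS = 168

retention = timedelta(hours=LOG_RETENTION_HOURS)
retention_days = retention.days
retention_ms = int(retention.total_seconds() * 1000)

print(retention_days)  # 7
print(retention_ms)    # 604800000
```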
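Two related settings in the server.properties file:
  • log.retention.check.interval.ms=300000 sets how often (here every 5 minutes) Kafka checks whether any log segment has exceeded the retention period. Segments that have exceeded it are deleted from the log.
  • log.segment.bytes=1073741824 sets the maximum size of a log segment file (1 GiB). When this size is reached, a new log segment is created.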
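How these settings interact can be sketched as a simplified model in Python. It assumes, for illustration, that each segment is described by its base offset and the timestamp of its newest record, and that the newest (active) segment is never deleted. The function names needs_new_segment and expired_segments and the sample data are hypothetical. Kafka's own logic handles more cases than this.

```python
RETENTION_MS = 168 * 60 * 60 * 1000   # log.retention.hours=168
SEGMENT_BYTES = 1073741824            # log.segment.bytes (1 GiB)


def needs_new_segment(active_size, record_size, segment_bytes=SEGMENT_BYTES):
    """True if appending the record would push the active segment past the limit."""
    return active_size + record_size > segment_bytes


def expired_segments(segments, now_ms, retention_ms=RETENTION_MS):
    """Return base offsets of closed segments older than the retention period.

    `segments` is a list of (base_offset, last_timestamp_ms) ordered by offset.
    The last entry is treated as the active segment and is never returned.
    """
    closed = segments[:-1]
    return [base for base, last_ts in closed if now_ms - last_ts > retention_ms]


day = 24 * 60 * 60 * 1000
now = 10 * day  # hypothetical clock: day 10, in milliseconds

# Made-up segments whose newest records date from day 1, 2, 5 and 9.
segments = [(0, 1 * day), (500, 2 * day), (900, 5 * day), (1200, 9 * day)]

print(expired_segments(segments, now))
print(needs_new_segment(SEGMENT_BYTES - 10, 100))
```

In this model a check like expired_segments would run once every log.retention.check.interval.ms, so an expired segment can stay on disk for up to one interval after it passes the retention period.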


I am Full Stack Java Developer @ Tata Strive | Get blogs and tutorials related to the (React | Kafka | DevOps) | Follow me on LinkedIn https://www.linkedin.com

Sagar Kudu
