Quick Navigation
Kafka CLI CommandsProducer ConfigurationConsumer ConfigurationReplication and DurabilityExactly-Once Semantics and TransactionsKafka Streams — KStream, KTable, GlobalKTableKafka Streams — Windowing and StateSchema Registry and SerializationKafka ConnectCluster Architecture, KRaft, and Topic ConfigSecurity and Monitoring
Kafka CLI Commands
- kafka-topics.sh --bootstrap-server localhost:9092 --create --topic orders --partitions 6 --replication-factor 3
- Creates a topic with 6 partitions and replication factor 3.
- kafka-topics.sh --bootstrap-server localhost:9092 --list
- Lists all topics in the cluster.
- kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic orders
- Shows partition count, replication factor, leader, replicas, and ISR for a topic.
- kafka-console-producer.sh --bootstrap-server localhost:9092 --topic orders --property parse.key=true --property key.separator=:
- Produces messages from the console with an explicit key:value split on the separator.
- kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic orders --from-beginning
- Consumes messages from the earliest offset for ad hoc testing.
- kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group order-service
- Shows current offset, log-end-offset, and consumer lag per partition for a group.
- kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group order-service --reset-offsets --to-earliest --topic orders --execute
- Resets a consumer group's offsets to the earliest available offset; omit --execute to dry-run.
- kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name orders --add-config retention.ms=604800000
- Dynamically alters a topic-level configuration without restarting brokers.
Producer Configuration
- acks
- Default is all (since Kafka 3.0) — requires acknowledgment from all in-sync replicas; other values are 0 (fire-and-forget) and 1 (leader only).
- enable.idempotence
- Default is true (since Kafka 3.0) — assigns a Producer ID and per-partition sequence number to eliminate duplicate writes on retry.
- linger.ms
- Default is 0 (send immediately); raising it batches more records per request, trading latency for throughput.
- batch.size
- Default is 16384 bytes — the maximum bytes of records batched together per partition before a send.
- compression.type
- Default is none; other options are gzip, snappy, lz4, and zstd — reduces network/disk usage at the cost of CPU.
- max.in.flight.requests.per.connection
- Default is 5 — the maximum unacknowledged requests per connection; must be <= 5 to preserve ordering with idempotence enabled.
- delivery.timeout.ms
- Default is 120000ms — the upper bound on time from send() to acknowledgment or failure, covering batching, retries, and the in-flight request.
- transactional.id
- A stable unique ID enabling transactions across producer restarts; required together with enable.idempotence=true for exactly-once writes.
Consumer Configuration
- enable.auto.commit
- Default is true — offsets are committed automatically every auto.commit.interval.ms (default 5000ms) on the next poll() call.
- auto.offset.reset
- Default is latest; earliest reads from the start of the topic; none throws an exception when no committed offset exists — applies only when there is no committed offset.
- session.timeout.ms
- Default is 45000ms — if no heartbeat is received within this window, the broker evicts the consumer and triggers a rebalance.
- max.poll.records
- Default is 500 — the maximum number of records returned in a single poll() call.
- max.poll.interval.ms
- Default is 300000ms (5 min) — the maximum time allowed between poll() calls before the consumer is considered failed and a rebalance is triggered.
- partition.assignment.strategy
- Default list is [RangeAssignor, CooperativeStickyAssignor]; CooperativeStickyAssignor enables incremental rebalancing without a stop-the-world pause.
- isolation.level
- Default is read_uncommitted; set to read_committed so the consumer only sees successfully committed transactional messages.
- group.instance.id
- Enables static group membership so a restarting consumer rejoins with the same identity, avoiding an unnecessary rebalance.
Replication and Durability
- Leader / Follower Replicas
- Every partition has one leader that handles all reads/writes and follower replicas that passively replicate the log.
- ISR (In-Sync Replicas)
- The set of replicas fully caught up with the leader within replica.lag.time.max.ms (default 30000ms); only ISR members are eligible for leader election by default.
- min.insync.replicas
- Default is 1 — the minimum ISR count required for a produce request with acks=all to succeed; set to 2+ for real durability with acks=all.
- unclean.leader.election.enable
- Default is false — prevents an out-of-sync replica from becoming leader, which would cause silent data loss.
- High Watermark
- The offset up to which all ISR replicas have replicated; consumers can only read up to the high watermark.
- Replication Factor
- The total number of copies of a partition across brokers (leader + followers); production standard is 3 for fault tolerance against one or two broker failures.
Exactly-Once Semantics and Transactions
- EOS Chain
- Exactly-once requires all three: idempotent producer (enable.idempotence=true), transactional producer (transactional.id set), and consumer with isolation.level=read_committed.
- producer.initTransactions()
- Registers the transactional.id with the transaction coordinator and fences off any previous zombie producer instance with the same ID.
- producer.beginTransaction() / producer.send(...) / producer.commitTransaction()
- Wraps one or more sends across multiple topics/partitions into a single atomic transaction; commitTransaction() makes all writes visible together.
- producer.abortTransaction()
- Rolls back all writes in the current transaction; consumers with read_committed will never see the aborted records.
- Idempotence Scope
- Idempotent producers deduplicate retries only within a single producer session; surviving a producer restart requires transactions with a stable transactional.id.
- transaction.state.log.replication.factor
- Default is 3 — replication factor of the internal __transaction_state topic that stores transaction metadata.
Kafka Streams — KStream, KTable, GlobalKTable
- KStream
- An unbounded stream of independent records (inserts); every event is treated as a new fact.
- KTable
- A changelog stream where each record updates the current value for its key; only the latest value per key is retained (like a materialized view).
- GlobalKTable
- Broadcasts the full dataset to every Streams instance; used for small reference/lookup data and does not require co-partitioning.
- KStream-KStream Join
- Requires a windowed join (JoinWindows) since both sides are unbounded; supports inner, left, and outer join variants.
- KStream-KTable Join
- A non-windowed lookup/enrichment join; requires co-partitioning (same partition count and partitioning strategy).
- KTable-KTable Join
- A non-windowed changelog merge join; requires co-partitioning for the primary-key join, but foreign-key joins do not.
- StreamsBuilder builder = new StreamsBuilder(); KStream<String, String> source = builder.stream("input-topic");
- Defines a processing topology by reading from a source topic into a KStream using the Streams DSL.
Kafka Streams — Windowing and State
- Tumbling Window
- Fixed-size, non-overlapping windows; each event belongs to exactly one window (e.g., count events per 5-minute block).
- Hopping Window
- Fixed-size, overlapping windows that advance by a hop interval smaller than the window size; a single event may fall in multiple windows.
- Sliding Window
- Windows are created based on the time difference between events, not fixed clock intervals; commonly used for join windows.
- Session Window
- Defined by an inactivity gap rather than a fixed duration; sessions merge when a new event arrives before the gap expires.
- State Stores
- Stateful operators (aggregate, reduce, count, join) persist local state in RocksDB by default, backed by a compacted internal changelog topic for fault tolerance.
- num.stream.threads
- Controls the number of stream processing threads per Streams instance; total application parallelism is capped by the source topic's partition count.
- TopologyTestDriver
- In-memory test harness for unit testing a Streams topology without a running Kafka cluster.
Schema Registry and Serialization
- BACKWARD
- Default compatibility mode — the new schema can read data written with the old schema; allows deleting fields and adding optional fields with defaults.
- FORWARD
- The old schema can read data written with the new schema; allows adding fields and deleting optional fields.
- FULL
- Combines BACKWARD and FORWARD — new and old schemas can both read each other's data; the strictest checked mode.
- NONE
- Disables compatibility checking entirely; any schema change is accepted.
- Confluent Wire Format
- Serialized Avro/Protobuf/JSON Schema messages are prefixed with a magic byte and a 4-byte schema ID that the deserializer uses to fetch the writer's schema from Schema Registry.
- TopicNameStrategy
- Default subject naming strategy — the subject is named <topic>-key or <topic>-value; alternatives are RecordNameStrategy and TopicRecordNameStrategy for multiple schemas per topic.
Kafka Connect
- Source Connector
- Reads data FROM an external system and writes it INTO Kafka (e.g., JDBC Source, Debezium CDC).
- Sink Connector
- Reads data FROM Kafka topics and writes it INTO an external system (e.g., JDBC Sink, S3 Sink).
- Standalone vs Distributed Mode
- Standalone runs a single worker with no fault tolerance (dev/test only); distributed runs a cluster of workers with automatic task rebalancing (production).
- Converters
- Serialize/deserialize record keys and values (JsonConverter, AvroConverter, StringConverter); configured independently for key.converter and value.converter.
- Single Message Transforms (SMTs)
- Lightweight per-record transforms such as InsertField, ReplaceField, MaskField, TimestampRouter, Cast, and RegexRouter; can be chained and are applied before Kafka write (source) or before sink write.
- errors.tolerance=all errors.deadletterqueue.topic.name=dlq-topic
- Routes records that fail conversion/transformation to a dead letter queue topic instead of stopping the connector.
- curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d @connector-config.json
- Registers a new connector on a distributed Connect worker via its REST API.
- tasks.max
- Sets the maximum number of parallel tasks a connector can create; actual task count is also bounded by the data source's parallelism (e.g., table count).
Cluster Architecture, KRaft, and Topic Config
- KRaft Mode
- ZooKeeper-free consensus mode using a Raft-based controller quorum for metadata management; ZooKeeper was fully removed starting with Kafka 4.0.
- process.roles=broker,controller
- Server config that designates a node's role(s) in KRaft mode; controllers manage cluster metadata, brokers serve client traffic.
- cleanup.policy
- delete (default) removes segments older than retention.ms/retention.bytes; compact retains only the latest value per key, using a null value (tombstone) to mark deletion.
- retention.ms
- Default is 604800000 (7 days) — how long records are retained before eligible for deletion under cleanup.policy=delete.
- Log Compaction Head/Tail
- The head holds newly written records (may contain duplicate keys); the tail holds already-compacted records with unique keys per compaction pass.
- Tiered Storage
- Splits data into a fast local tier and a cheaper remote object storage tier; not supported for compacted topics.
Security and Monitoring
- security.protocol
- PLAINTEXT (no security), SSL (encryption only), SASL_PLAINTEXT (auth only, unencrypted), SASL_SSL (auth + encryption, recommended for production).
- SASL Mechanisms
- PLAIN, SCRAM (challenge-response), GSSAPI (Kerberos), and OAUTHBEARER are the standard authentication mechanisms available for SASL.
- ACLs
- Authorization rules controlling which principals can perform which operations (Read, Write, Create, Describe) on which resources; disabled/open by default without an authorizer configured.
- Consumer Lag
- The difference between a partition's log-end-offset and the consumer group's committed offset; rising lag with stable producer throughput signals a slow consumer.
- Under-Replicated Partitions (URP)
- A non-zero count means one or more replicas have fallen out of the ISR; a critical broker health metric indicating durability risk.