
Kafka Streams Threading and Task Model

The Architecture Deep Dive: Each Kafka Streams application instance runs a configured number of stream threads (the num.stream.threads setting). Each thread runs an independent event loop: poll records from its assigned Kafka partitions, process them through the topology, update local state stores, produce output records, then repeat. Threads within the same instance share no processing state, which eliminates cross-thread locking in your processing logic.
The parallelism math matters. If you have 96 input partitions and deploy 8 application instances with 3 threads each, that's 24 total threads. With one task per input partition, each thread handles 4 tasks (96 tasks divided by 24 threads). Each task owns specific partitions and their associated state stores.
How Task Assignment Actually Works: Kafka Streams builds on the standard Kafka consumer group protocol. When instances join or leave, the group coordinator triggers a rebalance. Partitions get reassigned, and because tasks are bound to partitions, tasks move with them. This is how scaling works: add instances and tasks redistribute; remove instances and the surviving instances pick up the orphaned tasks.
The assignment is sticky by default, meaning tasks prefer to stay on the same instance to avoid unnecessary state migration. But if an instance crashes, its tasks must move, and the new owner must restore their state from the changelog topic before processing can resume.
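The 96-partition arithmetic, and what happens to per-thread load when an instance leaves, can be sketched in plain Java. This is a minimal model under stated assumptions, not Kafka Streams' actual assignor: the class name, the round-robin assign helper, the application id, and the broker address are all hypothetical. Only the three configuration keys are real Kafka Streams settings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative model only; Kafka Streams' real assignor is sticky and state-aware. */
final class TaskModelSketch {

    /** Total stream threads across the whole deployment. */
    static int totalThreads(int instances, int threadsPerInstance) {
        return instances * threadsPerInstance;
    }

    /** Spreads task ids 0..tasks-1 over threads round-robin (an assumption, for illustration). */
    static List<List<Integer>> assign(int tasks, int threads) {
        List<List<Integer>> byThread = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            byThread.add(new ArrayList<>());
        }
        for (int task = 0; task < tasks; task++) {
            byThread.get(task % threads).add(task);
        }
        return byThread;
    }

    /** Largest number of tasks held by any single thread. */
    static int maxLoad(List<List<Integer>> assignment) {
        int max = 0;
        for (List<Integer> tasks : assignment) {
            max = Math.max(max, tasks.size());
        }
        return max;
    }

    /** Per-instance settings; the keys are real config names, the values are hypothetical. */
    static Properties instanceConfig() {
        Properties props = new Properties();
        props.put("application.id", "threading-demo");     // hypothetical application name
        props.put("bootstrap.servers", "localhost:9092");  // hypothetical broker address
        props.put("num.stream.threads", "3");              // 3 threads per instance, as in the text
        return props;
    }

    public static void main(String[] args) {
        int threads = totalThreads(8, 3);
        System.out.println(threads + " threads, max load " + maxLoad(assign(96, threads)));
        // One instance (3 threads) leaves: 96 tasks are spread over the remaining 21 threads.
        System.out.println("after losing an instance, max load "
                + maxLoad(assign(96, totalThreads(7, 3))));
    }
}
```

With 8 instances every thread holds exactly 4 tasks. After one instance leaves, 96 tasks no longer divide evenly over 21 threads, so some threads carry 5. The model ignores the state restoration cost that a real task move incurs.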
Throughput at Scale: 1-2M records/sec, with 5-20 ms per-record latency.
State Store Locality: Each task owns its state stores. Reads and writes are synchronous operations within the processing loop, so they must be fast. This is why Kafka Streams uses embedded key-value stores backed by disk with in-memory caching. There are no network calls to fetch state during processing.
This creates a critical constraint: stateful operations must be partition-aligned. If you need to join streams on a key other than the one the topics are partitioned by, Kafka Streams automatically inserts a repartition topic. Your data gets written to this intermediate topic partitioned by the new key, then read back by the tasks responsible for those partitions. Repartitioning works, but it adds latency and roughly doubles the Kafka traffic for that stage of your topology, because every record is written and read one extra time.
⚠️ Common Pitfall: Too many repartitions can kill performance. Each one adds serialization overhead, network transfer, and disk writes. Design your input topics and keys carefully to minimize repartitioning stages.
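Why a join needs the data re-keyed can be seen with a small stand-alone sketch. It assumes a simple hash-modulo partitioner built on String.hashCode, which is a stand-in and not Kafka's default murmur2-based partitioner; the Txn class, the sample ids, and the partition count of 8 are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of key-based partitioning; all names here are hypothetical. */
final class RepartitionSketch {

    /** Minimal record carrying both candidate keys. */
    static final class Txn {
        final String transactionId;
        final String userId;

        Txn(String transactionId, String userId) {
            this.transactionId = transactionId;
            this.userId = userId;
        }
    }

    /** Stand-in hash partitioner; Kafka's default uses murmur2, not hashCode. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Partitions that hold one user's records under the chosen record key. */
    static Set<Integer> partitionsForUser(List<Txn> txns, String userId,
                                          boolean keyedByUser, int numPartitions) {
        Set<Integer> partitions = new TreeSet<>();
        for (Txn t : txns) {
            if (!t.userId.equals(userId)) {
                continue;
            }
            String key = keyedByUser ? t.userId : t.transactionId;
            partitions.add(partitionFor(key, numPartitions));
        }
        return partitions;
    }

    /** Fifty transactions for a single user, with ids txn-0 .. txn-49. */
    static List<Txn> sampleTxns() {
        List<Txn> txns = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            txns.add(new Txn("txn-" + i, "alice"));
        }
        return txns;
    }

    public static void main(String[] args) {
        List<Txn> txns = sampleTxns();
        System.out.println("keyed by transaction_id: " + partitionsForUser(txns, "alice", false, 8));
        System.out.println("keyed by user_id:        " + partitionsForUser(txns, "alice", true, 8));
    }
}
```

Keyed by transaction id, one user's records scatter across several partitions, so no single task sees them all. Keyed by user id, they land on exactly one partition, which is the alignment a repartition topic restores before a stateful operation.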
Scaling in Practice: To handle 10x traffic, you increase input topic partitions and add more application instances. Kafka Streams redistributes tasks automatically. The ceiling is one task per thread: 96 partitions yield 96 tasks, so at most 96 threads can do useful work (distributed across however many instances you want), and any threads beyond that sit idle. To go further, you need to repartition your input topics with a higher partition count to create more parallelism.
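The ceiling can be expressed as a one-line bound. The following is a minimal sketch assuming one task per input partition, as in the example above; the class and method names are hypothetical.

```java
/** Illustrative bound on useful parallelism; names are hypothetical. */
final class ScalingCeilingSketch {

    /** Threads that receive at least one task: never more than the task count. */
    static int busyThreads(int tasks, int instances, int threadsPerInstance) {
        return Math.min(tasks, instances * threadsPerInstance);
    }

    /** Threads left with nothing to do once every task has an owner. */
    static int idleThreads(int tasks, int instances, int threadsPerInstance) {
        return Math.max(0, instances * threadsPerInstance - tasks);
    }

    public static void main(String[] args) {
        // 8 instances x 3 threads: well under the 96-task ceiling.
        System.out.println("busy=" + busyThreads(96, 8, 3) + " idle=" + idleThreads(96, 8, 3));
        // 40 instances x 3 threads: 120 threads compete for 96 tasks.
        System.out.println("busy=" + busyThreads(96, 40, 3) + " idle=" + idleThreads(96, 40, 3));
    }
}
```

At 8 instances all 24 threads are busy. At 40 instances only 96 of the 120 threads get a task, so the extra 24 add cost without adding throughput.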
💡 Key Takeaways
Stream threads are independent event loops with no shared state, eliminating locking and simplifying concurrency
Tasks are bound to partitions one-to-one, making scaling straightforward: more partitions means more parallelism
State stores are task-local and synchronous, avoiding network calls during processing but requiring partition-aligned operations
Repartitioning is automatic when joining on new keys but doubles Kafka traffic and adds latency for that topology stage
📌 Examples
1. With 96 partitions and 8 instances running 3 threads each, you get 24 total threads handling 4 tasks per thread, achieving 1 to 2 million records per second throughput
2. If your topology joins streams on <code>user_id</code> but input topics are partitioned by <code>transaction_id</code>, Kafka Streams inserts a repartition topic to reshuffle data by <code>user_id</code> before the join
Kafka Streams Threading and Task Model | Kafka Streams Architecture - System Overflow