Stateless Architecture: The Foundation of Horizontal Scaling
Why Stateless Services Scale:
A stateless service stores no client data between requests. Every request contains all information needed to process it. This single property unlocks horizontal scaling because any server can handle any request. Add 10 more servers, and they immediately start handling traffic. No synchronization, no shared memory, no coordination.
Contrast this with stateful services where user sessions live in server memory. If Server A holds your shopping cart, all your requests must route to Server A. Lose that server, lose the cart. Add new servers, and they sit idle until they build up their own sessions. You have created an accidental database inside your application tier.
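The contrast above can be sketched in a few lines. This is a minimal illustration with hypothetical class names, using a plain dict as a stand-in for an external store like Redis:

```python
# Stateful: the cart lives in one server's memory, so requests must be
# pinned to that server, and losing it loses the cart.
class StatefulServer:
    def __init__(self):
        self.carts = {}                # per-server, in-memory state

    def add_to_cart(self, user_id, item):
        self.carts.setdefault(user_id, []).append(item)
        return self.carts[user_id]

# Stateless: every request carries the IDs it needs; state lives in a
# shared external store, so any server can handle any request.
class StatelessServer:
    def __init__(self, shared_store):
        self.store = shared_store      # e.g. Redis in production

    def add_to_cart(self, user_id, item):
        cart = self.store.get(user_id, [])
        cart.append(item)
        self.store[user_id] = cart
        return cart

shared = {}                            # stand-in for an external store
a, b = StatelessServer(shared), StatelessServer(shared)
a.add_to_cart("u1", "book")
print(b.add_to_cart("u1", "pen"))      # → ['book', 'pen']  (any server sees the cart)
```

Note that two different `StatelessServer` instances, standing in for two machines behind a load balancer, see the same cart because neither holds it locally.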
How to Externalize State:
Move state out of application servers into dedicated stores:
First, sessions go to Redis or Memcached. A 3-node Redis cluster handles 100,000 to 300,000 operations per second with sub-millisecond latency. A session lookup adds perhaps 0.5 ms to each request but enables unlimited horizontal scaling of application servers.
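A session store along these lines might look like the following sketch. The interface is hypothetical; in production the backend would be Redis (e.g. redis-py's `SETEX`/`GET` with a TTL), but a plain dict stands in here so the example is self-contained:

```python
import json
import time
import uuid

class SessionStore:
    """Externalized session store shared by all app servers (sketch)."""

    def __init__(self, backend, ttl_seconds=3600):
        self.backend = backend         # shared store; Redis in production
        self.ttl = ttl_seconds

    def create(self, user_data):
        sid = uuid.uuid4().hex
        # Store serialized data together with an expiry timestamp,
        # mimicking Redis's per-key TTL.
        self.backend[sid] = (json.dumps(user_data), time.time() + self.ttl)
        return sid

    def load(self, sid):
        entry = self.backend.get(sid)
        if entry is None or entry[1] < time.time():
            return None                # missing or expired
        return json.loads(entry[0])

backend = {}                           # one shared store, many servers
server_a = SessionStore(backend)
server_b = SessionStore(backend)
sid = server_a.create({"user_id": 42, "cart": ["book"]})
print(server_b.load(sid))              # session created on A, read from B
```

The point of the sketch: the app servers hold no session data themselves, so the load balancer is free to send the next request anywhere.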
Second, file uploads go to object storage like Amazon S3 or Google Cloud Storage. S3 handles unlimited concurrent uploads with 99.99% availability. Never store uploaded files on local disk where they become stranded on one server.
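An upload handler following this rule might look like the sketch below. The `ObjectStore` interface and key scheme are assumptions for illustration; in production the store would wrap S3 (e.g. boto3's `put_object`/`get_object`), with a dict standing in here:

```python
import hashlib

class ObjectStore:
    """Stand-in for S3/GCS; production code would call the real SDK."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data

    def get(self, key):
        return self.blobs.get(key)

def handle_upload(store, filename, data):
    # Key by content hash so the object is addressable from any server,
    # and duplicate uploads map to the same key.
    key = f"uploads/{hashlib.sha256(data).hexdigest()}/{filename}"
    store.put(key, data)
    return key                         # persist the key, never a local path

store = ObjectStore()
key = handle_upload(store, "avatar.png", b"\x89PNG...")
print(store.get(key) is not None)      # any server can fetch by key
```

The crucial detail is the return value: the server records a store key, not a filesystem path, so no file is ever stranded on one machine's disk.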
Third, background job state goes to message queues like Amazon Simple Queue Service (SQS) or Redis. Workers pull jobs from the queue, process them, and acknowledge completion. Any worker can handle any job.
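The pull-process-acknowledge loop can be sketched as below. The job shape is hypothetical, and `queue.Queue` stands in for the real broker so the example runs locally; with SQS the pull would be `receive_message` and the acknowledgment `delete_message`:

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker(name):
    while True:
        job = jobs.get()               # pull: blocks until a job is available
        if job is None:                # sentinel: shut this worker down
            jobs.task_done()
            return
        results.append((name, job * 2))  # "process" the job
        jobs.task_done()               # acknowledge completion

# Three interchangeable workers; any one may pick up any job.
threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(3)]
for t in threads:
    t.start()
for n in range(5):
    jobs.put(n)
for _ in threads:
    jobs.put(None)                     # one sentinel per worker
jobs.join()                            # wait until every job is acknowledged
print(sorted(r for _, r in results))   # → [0, 2, 4, 6, 8]
```

Because workers hold no job state of their own, adding capacity is just starting more worker processes against the same queue.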
⚠️ Sticky Sessions Are a Code Smell: If you need load balancer sticky sessions to route users to specific servers, your service is accidentally stateful. Fix the root cause by externalizing state rather than working around it.
The Stateless API Pattern:
Consider a ride-sharing API server. A request arrives: "Get available drivers near location X for user Y." The server receives the user ID, location, and auth token in the request itself. It fetches user preferences from Redis, queries the driver database, computes matches, and returns results. No local state. Deploy 50 API servers behind a load balancer and they immediately share traffic. One server crashes mid-request? The client retries and another server handles it identically. This is why every major platform runs stateless API tiers in front of stateful data stores.

💡 Key Takeaways
✓ Stateless services store no client data between requests, allowing any server to handle any request and enabling linear horizontal scaling without coordination overhead
✓ Externalize sessions to Redis (100K to 300K ops per second, sub-millisecond latency), file uploads to S3 (unlimited throughput, 99.99% availability), and job state to message queues
✓ Sticky sessions indicate accidental statefulness; they prevent true horizontal scaling and create single points of failure when the pinned server fails
✓ JWTs enable stateless authentication by encoding user identity in signed tokens that any server can verify without querying a central session store
✓ The latency cost of external state (0.5 to 2 ms per Redis lookup) is negligible compared to the scaling flexibility gained by making services stateless
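The JWT takeaway can be sketched with the standard library alone. This is a minimal HMAC-signed-token illustration, not a full JWT implementation (real deployments should use a vetted library such as PyJWT, and tokens should carry an expiry claim); it shows why any server holding the shared secret can verify a token without touching a session store:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"shared-secret"              # assumed: distributed to all servers

def _b64(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def issue(claims: dict) -> str:
    """Sign the claims; run on whichever server handles login."""
    payload = _b64(json.dumps(claims, sort_keys=True).encode())
    sig = _b64(hmac.new(SECRET, payload, hashlib.sha256).digest())
    return (payload + b"." + sig).decode()

def verify(token: str):
    """Verify locally on any server -- no central session lookup."""
    payload, sig = token.encode().split(b".")
    expected = _b64(hmac.new(SECRET, payload, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None                    # tampered token or wrong secret
    padded = payload + b"=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

token = issue({"user_id": 42})         # issued by server A
print(verify(token))                   # verified by server B: {'user_id': 42}
```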
📌 Examples
1. Netflix API servers are completely stateless; user profiles, watch history, and recommendations are fetched from backend services on each request, allowing thousands of API servers to scale independently
2. Shopify checkout stores cart state in Redis with a 100 ms Time To Live (TTL) refresh; if any checkout server fails, users continue seamlessly on another server without losing their cart
3. Uber trip state lives in a distributed database, not on driver-app servers; when a driver's phone reconnects after network loss, it can hit any server and resume the trip from persistent state