A key in your Redis cache just expired, and it serves 8,000 requests every second.
Every single one of those requests is now flying straight at your database.
This is the thundering herd. You didn't have a traffic problem — you had a cache problem. Now you have both.
Here's the setup:
Service → Node.js API, 8,000 req/sec on the /feed endpoint
Cache → Redis, TTL = 60s on the feed key
DB → Postgres, comfortable at ~200 req/sec sustained
What happened → TTL expired at peak traffic, and the full 8,000 req/sec fell through to Postgres at once, roughly 40x what it can sustain
The DB is on its knees. You have minutes before it falls over. And the next TTL expiry is in 60 seconds.
What do you do?
A) Mutex lock — only one request queries the DB to rebuild the cache, the rest wait behind it.
B) Probabilistic early expiry — start randomly rebuilding the cache before the TTL actually hits zero.
C) Request coalescing — collapse all in-flight requests for the same key into a single DB query, return the same result to all of them.
D) Cache pre-warming — a background job rebuilds the key on a schedule, TTL never reaches zero in prod.
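To make one of these concrete, here is a minimal sketch of what option C could look like inside a single Node.js process. It is an illustration only, not a hint at the answer. The names `coalesce`, `inFlight`, and `Loader` are hypothetical, and the loader stands in for whatever Postgres query builds the feed.

```typescript
// Hypothetical loader type: stands in for the Postgres query that builds the feed.
type Loader<T> = () => Promise<T>;

// In-flight lookups, keyed by cache key. This map lives in process memory,
// so it coalesces requests per Node.js instance, not across a whole fleet.
const inFlight = new Map<string, Promise<unknown>>();

// Returns the result of `load`, sharing one pending call among all
// concurrent callers that ask for the same key.
async function coalesce<T>(key: string, load: Loader<T>): Promise<T> {
  const pending = inFlight.get(key);
  if (pending) {
    // Someone is already loading this key: wait on their promise.
    return pending as Promise<T>;
  }

  // First caller for this key: start the load and register it.
  // The entry is removed once the load settles, whether it succeeds or fails,
  // so a failed load is not served to later requests.
  const promise = load().finally(() => {
    inFlight.delete(key);
  });
  inFlight.set(key, promise);
  return promise;
}
```

In a /feed handler you would presumably check Redis first and only call `coalesce("feed", loadFeedFromPostgres)` on a miss, writing the result back to Redis inside the loader (`loadFeedFromPostgres` is a made-up name). Note the assumption baked in: with N API instances, Postgres can still see up to N rebuild queries per expiry, one per process.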
All four ship in production systems. Only one of them prevents the thundering herd without introducing a new failure mode under load.
Pick one — A, B, C, or D — and tell me why. Full breakdown in the comments (including which answer is the senior-engineer trap that works in theory but falls apart when 8,000 requests are piling up).
If your team has ever had a cache expiry take down a database, share this with them. The debate is worth more than the post.
Drop your answer 👇













