How Uber Serves over 150 Million Reads per Second from Integrated Cache with Stronger Consistency Guarantees
26 August / Global
Introduction
This is the second blog about the integrated cache infrastructure for Docstore, our online storage at Uber. In the previous blog, we presented CacheFront and outlined its core functionality, design principles, and architecture. In this blog, we share some exciting improvements we’ve implemented since then as we scaled up its footprint by almost 4 times.
CacheFront Recap
The previous blog has a detailed explanation of the overall Docstore architecture. For this blog, it’s sufficient to remember that CacheFront is implemented in our stateless query engine layer, which communicates with the stateful storage engine nodes to serve all read and write requests.

Reads
The reads are intercepted at the query engine layer to first try fetching the rows from Redis™. Rows that aren't found in Redis are fetched directly from the storage engine and subsequently written to the cache to keep it warm. The read results are then merged and streamed back to the client.
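To make the flow concrete, here's a minimal sketch of that read path in Go. The types and helpers (`Key`, `Row`, `redisGet`, `storageRead`, `cacheSet`) are placeholders for illustration, not the actual Docstore or CacheFront APIs:

```go
package cachefront

import "context"

// Key and Row stand in for Docstore's row key and row value types;
// they're simplified here for illustration.
type Key string
type Row []byte

// Hypothetical helpers, not the actual CacheFront APIs.
func redisGet(ctx context.Context, keys []Key) (hits map[Key]Row, misses []Key) { return }
func storageRead(ctx context.Context, keys []Key) (map[Key]Row, error)          { return nil, nil }
func cacheSet(ctx context.Context, k Key, r Row)                                {}

// readRows sketches the CacheFront read path: try Redis first, fall back to
// the storage engine for misses, backfill the cache, and merge the results.
func readRows(ctx context.Context, keys []Key) (map[Key]Row, error) {
	// 1. Try to serve as many rows as possible from Redis.
	results, misses := redisGet(ctx, keys)
	if results == nil {
		results = make(map[Key]Row, len(keys))
	}

	// 2. Fetch the cache misses directly from the storage engine.
	if len(misses) > 0 {
		fetched, err := storageRead(ctx, misses)
		if err != nil {
			return nil, err
		}
		for k, row := range fetched {
			results[k] = row
			cacheSet(ctx, k, row) // 3. Write back to Redis to keep the cache warm.
		}
	}

	// 4. The merged result set is returned (streamed) back to the client.
	return results, nil
}
```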
Writes
Docstore generally supports two types of updates:
- Point writes: INSERT, UPDATE, DELETE queries that update a predefined set of rows, specified as the arguments of the DML query itself.
- Conditional updates: UPDATE or DELETE queries with a WHERE filter clause, where one or more rows can be updated based on the condition in the filter (see the examples below).
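To illustrate the difference, here are two generic SQL statements embedded in Go string constants. The table and column names are made up, and the syntax is plain MySQL rather than Docstore's actual query interface:

```go
package example

// A point write names exactly which row(s) it touches, so the affected
// cache keys are known up front.
const pointWrite = `UPDATE orders SET status = 'DELIVERED' WHERE order_id = 42`

// A conditional update only describes the affected rows through a filter;
// which rows change (and therefore which cache entries become stale) isn't
// known until the query actually runs against the storage engine.
const conditionalUpdate = `UPDATE orders SET status = 'EXPIRED'
WHERE status = 'PENDING' AND created_at < NOW() - INTERVAL 30 DAY`
```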
The writes aren’t intercepted at all, mainly because of conditional updates: without knowing which rows will change (or have changed) as a result of running the DML query, we can’t tell which cache entries require invalidation.
Waiting for rows to expire from Redis based on their time-to-live (TTL) isn’t good enough on its own. So we also rely on Flux, our stateless change data capture (CDC) service that, among its other responsibilities, tails MySQL binlogs to see which rows have been updated and asynchronously invalidates or populates the Redis cache, usually within a sub-second delay. When invalidating, instead of deleting from Redis, Flux writes special invalidation markers that replace whatever entries are currently cached.
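A simplified sketch of that invalidation step is shown below, using a generic go-redis client. The marker value and helper name are assumptions for illustration, not Flux’s actual implementation:

```go
package flux

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// invalidationMarker is a hypothetical sentinel value written in place of the
// cached row, replacing whatever entry is currently cached for that key.
const invalidationMarker = "__invalidated__"

// invalidateRow sketches what a CDC consumer could do after seeing a row
// change in the MySQL binlog: overwrite the cached entry with a marker
// (with its own TTL) rather than deleting the key outright.
func invalidateRow(ctx context.Context, rdb *redis.Client, cacheKey string, ttl time.Duration) error {
	return rdb.Set(ctx, cacheKey, invalidationMarker, ttl).Err()
}
```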
Challenges
As the scale of our CacheFront deployment increased, it became clear that there was a growing appetite for a higher cache hit rate and stronger consistency guarantees. The eventually consistent nature of using TTLs and CDC for cache invalidation became a blocker for adoption in some cases. Additionally, to make caching more effective, engineers were tempted to increase the TTL to extend the lifespan of rows in the cache. This, in turn, resulted in more stale or inconsistent values being served from the cache, and the service owners weren’t happy about it.
These challenges are as old as the idea of caching itself. To reiterate from the previous blog: