Cinnamon Auto-Tuner: Adaptive Concurrency in the Wild
Figure 1: The relationship between max concurrent requests and throughput. At some point the service can’t handle more, and throughput drops off fast.
Figure 2: Architecture diagram of Cinnamon, with the scheduler and auto-tuner components highlighted.
Figure 3: Prioritized request scheduling
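
Figure 3’s prioritized scheduling can be pictured as one queue per priority level, with the dispatcher always draining the most important non-empty queue first. The sketch below only illustrates that idea; the `Scheduler` and `Request` types and their methods are hypothetical, not Cinnamon’s actual API.

```go
package main

import "fmt"

// Request is a placeholder for a unit of work with a priority
// (lower number = more important). All names are illustrative.
type Request struct {
	ID       int
	Priority int
}

// Scheduler keeps one FIFO queue per priority level and always
// serves the highest-priority non-empty queue first.
type Scheduler struct {
	queues [][]*Request // index = priority level
}

func NewScheduler(levels int) *Scheduler {
	return &Scheduler{queues: make([][]*Request, levels)}
}

func (s *Scheduler) Enqueue(r *Request) {
	s.queues[r.Priority] = append(s.queues[r.Priority], r)
}

// Dequeue returns the next request to run, or nil if everything is empty.
func (s *Scheduler) Dequeue() *Request {
	for p := range s.queues {
		if len(s.queues[p]) > 0 {
			r := s.queues[p][0]
			s.queues[p] = s.queues[p][1:]
			return r
		}
	}
	return nil
}

func main() {
	s := NewScheduler(3)
	s.Enqueue(&Request{ID: 1, Priority: 2})
	s.Enqueue(&Request{ID: 2, Priority: 0})
	fmt.Println(s.Dequeue().ID) // 2: the higher-priority request is served first
}
```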
Figure 4: The lower the limit, the more tolerance for latency increases.
Figure 5: The aggregation process from individual request timings to a smoothed value.
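
Figure 5 describes reducing individual request timings to a single smoothed value, but the captions don’t specify how the smoothing is done. A minimal sketch, assuming a per-interval percentile followed by an exponentially weighted moving average (both assumptions, not the documented algorithm), could look like this:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// percentile returns the q-quantile (0..1) of the given request timings.
func percentile(timings []time.Duration, q float64) time.Duration {
	if len(timings) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), timings...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(q * float64(len(sorted)-1))
	return sorted[idx]
}

// ewma folds a new per-interval sample into the running smoothed value.
func ewma(prev, sample time.Duration, alpha float64) time.Duration {
	return time.Duration(alpha*float64(sample) + (1-alpha)*float64(prev))
}

func main() {
	interval := []time.Duration{8 * time.Millisecond, 12 * time.Millisecond, 40 * time.Millisecond}
	smoothed := 10 * time.Millisecond
	smoothed = ewma(smoothed, percentile(interval, 0.95), 0.2)
	fmt.Println(smoothed) // previous value nudged toward this interval's p95
}
```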
Figure 6: The ever-drifting targetLatency issue in effect. When resetting targetLatency, the new value is captured at a higher limit, which leads to both of them drifting up. Note that the overload stopped at ~14:10.
Figure 7A: Positive covariance between the number of inflight requests and throughput.
Figure 7B: Negative covariance between the number of inflight requests and throughput.
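
Figures 7A and 7B distinguish the two regimes by the sign of the covariance between inflight requests and throughput. As a generic illustration (not Cinnamon’s code), the sample covariance over a window of paired measurements can be computed like this:

```go
package main

import "fmt"

// covariance returns the sample covariance of two equally sized series,
// e.g. per-interval inflight-request counts and throughput measurements.
func covariance(x, y []float64) float64 {
	n := len(x)
	if n < 2 || n != len(y) {
		return 0
	}
	var meanX, meanY float64
	for i := 0; i < n; i++ {
		meanX += x[i]
		meanY += y[i]
	}
	meanX /= float64(n)
	meanY /= float64(n)
	var cov float64
	for i := 0; i < n; i++ {
		cov += (x[i] - meanX) * (y[i] - meanY)
	}
	return cov / float64(n-1)
}

func main() {
	inflight := []float64{10, 20, 30, 40}
	throughput := []float64{100, 180, 240, 280} // throughput still rising with concurrency
	fmt.Println(covariance(inflight, throughput) > 0)
}
```

A positive result suggests more concurrency still buys more throughput, while a negative one suggests the node is past the point where, as Figure 1 shows, throughput drops off.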
Figure 8: The effect of covariance when resetting the inflight limit. Both the inflight limit and the latency samples (i.e., request timings) are now stable.
Figure 9: Overloading one node in production using Ballast
Figure 10A: Throughput and latency during overload
Figure 10B: Inflight limit (top); the ratio between the number of inflight requests and the inflight limit (middle); CPU usage (bottom)