
Introducing QALM, Uber’s QoS Load Management Framework

22 March 2018 / Global
Figure 1: In QALM, our overload detector calculates request latency in the buffer queue to detect overload.
Figure 2: QALM isolates endpoints, such that if EP1 suffers degradation, EP2 still works as normal.
Figure 3: Service owners can use the QALM UI to update the criticality for endpoint-caller pairs.
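Figure 3 implies that QALM keys criticality on endpoint-caller pairs, and Figure 6 shows non-critical traffic being shed under overload. A hedged sketch of such a lookup feeding a shed decision — the pairings and level names below are hypothetical examples, not Uber's actual configuration:

```go
package main

import "fmt"

// pair identifies an endpoint-caller combination, as configured in the QALM UI.
type pair struct{ endpoint, caller string }

// criticality holds example endpoint-caller criticality levels
// (values are illustrative; real pairings live in service owners' config).
var criticality = map[pair]string{
	{"apply", "payment-service"}:             "critical",
	{"getProvidedCards", "document-staging"}: "non-critical",
}

// shouldShed drops a request only when the service is overloaded AND the
// endpoint-caller pair is not marked critical.
func shouldShed(endpoint, caller string, overloaded bool) bool {
	return overloaded && criticality[pair{endpoint, caller}] != "critical"
}

func main() {
	fmt.Println(shouldShed("apply", "payment-service", true))             // false: critical traffic survives
	fmt.Println(shouldShed("getProvidedCards", "document-staging", true)) // true: non-critical traffic is shed
}
```

The key property is that shedding is selective: an overloaded service degrades by dropping low-value traffic first instead of failing uniformly.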
Figure 4: Without QALM integration, p99 request latency increases non-linearly to ~20 seconds.
Figure 5: With QALM integration, p99 latency for successful requests improved to ~400 milliseconds during the overload period.
Figure 6: QALM correctly identified non-critical requests from document-staging, shedding ~50 percent of them during the overload period.
Figure 7: QALM's inbound middleware plugs into the YARPC layer, so service handlers require no code changes.
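The middleware approach in Figure 7 can be illustrated with the standard wrapper pattern: QALM intercepts requests before the handler runs, so the handler itself is untouched. This is a simplified stand-in, not YARPC's actual middleware interface:

```go
package main

import "fmt"

// handler is a simplified stand-in for a service's RPC handler.
type handler func(req string) string

// qalmInbound wraps a handler with load shedding. Because the wrapper sits in
// the RPC layer, the wrapped handler needs no code changes — it never sees
// requests that were shed.
func qalmInbound(overloaded func() bool, next handler) handler {
	return func(req string) string {
		if overloaded() {
			return "error: shed by QALM"
		}
		return next(req)
	}
}

func main() {
	echo := func(req string) string { return "ok: " + req }

	healthy := qalmInbound(func() bool { return false }, echo)
	fmt.Println(healthy("getProvidedCards")) // ok: getProvidedCards

	stressed := qalmInbound(func() bool { return true }, echo)
	fmt.Println(stressed("getProvidedCards")) // error: shed by QALM
}
```

In the real system the `overloaded` signal would come from the queue-latency detector, and the rejection would be a proper RPC error rather than a string.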
Figure 8: QALM runs with low CPU usage, adding only ~3 percent overhead.
Figure 9: Without QALM, all apply requests began to time out after getProvidedCards reached 1,200 RPS.
Figure 10: With QALM enabled, apply could still serve requests even as getProvidedCards reached 1,800 RPS. Only two requests timed out during 30 minutes of load testing.
Figure 11: With QALM providing endpoint-level isolation for Uber Visa, we saw a 50 percent improvement in tolerance to traffic spikes.
Keep QALM and code on
Our motto for the QALM project.

Posted by Uber