The Observability team builds the tools and systems that every engineering team at Uber uses to develop, scale, understand, and monitor their systems. These tools are critical to Uber – without them it would be impossible to understand and debug problems in an environment with over three thousand microservices, hundreds of thousands of CPU cores across multiple data centers and the cloud, and hundreds of thousands of concurrent trips around the world.
The Observability suite includes:
- Jaeger, our open source enterprise tracing system, written in Go. It provides actionable insight into individual flows through our microservice architecture and helps engineers comprehend Uber’s software ecosystem as a whole.
- M3, our open source enterprise metrics stack. It handles hundreds of millions of emitted metrics per second, which are used for monitoring and alerting on every product and microservice at Uber.
- Synoptic, our Uber-aware dashboarding system which displays context-sensitive information from across the Uber ecosystem, enabling quick detection and mitigation of issues.
- Our deeply integrated On-Call Experience suite of tools, which gives on-call engineers everything they need to raise, track, and close outages and incidents, to track the signal-to-noise ratio (SNR) of alerts, and to improve their team's health by reducing alert load.
- Blackbox, our system for externally monitoring our critical business endpoints, via emulated workflows.
- A new system under development to provide enterprise logging, with deep integration into our Observability stack, including alerting, linkage to traces, etc.
Jaeger is Uber’s open source distributed tracing system, designed to provide real-time performance monitoring and profiling for distributed architectures. Inspired by Google’s Dapper and OpenZipkin, Jaeger is a complete redesign based on the new OpenTracing standard. Since its first production deployment about a year ago, over 600 microservices have been integrated with Jaeger, with many hundreds more to come.
The project was recently open sourced, and the team is working with other major tech companies to make it the leading tracing project for large-scale distributed systems worldwide. Check out our Distributed Tracing blog post: https://eng.uber.com/distributed-tracing/