Introduction
Search underpins nearly every real-time experience at Uber: from matching riders and drivers, to fraud detection, to powering recommendations on Uber Eats. As these systems grew in traffic, payload size, and latency sensitivity, we hit a familiar industry limit: REST/JSON was no longer just slower—it was constraining how our search platform could evolve.
Internally, much of Uber’s infrastructure already communicates using gRPC™ and Protobuf. These systems rely on strongly typed contracts, efficient binary serialization, and streaming-friendly transports. But OpenSearch, which Uber had standardized as the foundation for search and retrieval, historically exposed only REST/JSON APIs. Bridging these two worlds required translation layers that added latency, complexity, and operational risk.
Rather than treating this mismatch as an integration problem, we approached it as an opportunity to advance Uber’s Search Platform and OpenSearch itself.
In this post, we’ll walk through why we introduced native gRPC endpoints in OpenSearch, how we designed them to coexist with REST, and what we learned after deploying them in production at Uber. We’ll also share the performance gains we observed, particularly for ingestion and vector search workloads, and how this work contributes back to the OpenSearch open-source ecosystem.
Architecture
We designed and implemented native gRPC support directly in OpenSearch to avoid maintaining long-term forks or translation layers. This approach ensured that the solution would benefit both Uber's production workloads and the broader OpenSearch community.
Below, we outline the resulting architecture and how gRPC and REST coexist as first-class transports within OpenSearch. This design intentionally preserves REST for compatibility and operational simplicity, while enabling gRPC for performance-critical and high-throughput workloads.
gRPC Server
The core component of the native gRPC transport is the server implementation in OpenSearch.
For flexibility, the gRPC transport ships as a module. It runs alongside the REST transport on a separate set of ports, so a node can serve both REST and gRPC requests. Figure 1 shows how only the client-server layer differs between the REST and gRPC transports, while the internal node-to-node logic remains shared.
To support extensibility, the gRPC transport publishes an SPI (service provider interface) so that query types implemented outside of core (for example, in the k-NN plugin) can provide their own Proto-to-POJO conversion utilities. Plugins can also implement and expose their own gRPC services and hook them into the main transport-gRPC module.
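The SPI pattern can be illustrated with a minimal registry sketch. This is written in Python for brevity (the actual OpenSearch SPI is Java), and every name below is hypothetical: plugins register a converter for the query types they own, and the transport dispatches to it when deserializing a request.

```python
from typing import Any, Callable, Dict

# Hypothetical converter registry illustrating the SPI pattern:
# plugins register Proto-to-POJO converters for query types they own,
# and the transport looks up the right converter per incoming query.
class QueryConverterRegistry:
    def __init__(self) -> None:
        self._converters: Dict[str, Callable[[dict], Any]] = {}

    def register(self, query_type: str, converter: Callable[[dict], Any]) -> None:
        if query_type in self._converters:
            raise ValueError(f"converter already registered for {query_type!r}")
        self._converters[query_type] = converter

    def convert(self, query_type: str, proto_msg: dict) -> Any:
        try:
            return self._converters[query_type](proto_msg)
        except KeyError:
            raise ValueError(f"no converter for query type {query_type!r}") from None

# Core registers built-in converters; a plugin (e.g., k-NN) adds its own.
registry = QueryConverterRegistry()
registry.register("term", lambda m: {"kind": "term", **m})
registry.register("knn", lambda m: {"kind": "knn", **m})  # plugin-provided

print(registry.convert("knn", {"field": "embedding", "k": 10}))
```

The key property is that core never needs to know about plugin query types at compile time; a plugin that fails to register simply produces a clear "no converter" error at request time.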
API Coverage
For most OpenSearch customers, including those at Uber, the most latency-sensitive APIs are search and ingestion. We prioritized these two, Search and Bulk, where gRPC's performance characteristics provide the clearest benefit.
JSON to Protobuf Conversion
A gRPC transport is only as strong as its data model. To support OpenSearch natively over gRPC while keeping the JSON and Protobuf APIs in sync, we established conversion rules and built automation tooling for spec-to-Proto conversion.
The conversion of the JSON-based API specification to Protobuf looks something like what’s shown in Figure 2.
The end result of this automation is a reliable, repeatable mechanism for maintaining long-term API parity and backward compatibility between OpenSearch’s REST and gRPC interfaces.
To make gRPC viable as a long-term, first-class transport, we needed a way to evolve Protobuf and REST APIs together without manual drift or compatibility risk. In practice, this required more than straightforward code generation: REST and Protobuf have fundamentally different semantics and evolution constraints.
To address this, we built an end-to-end automated pipeline with three stages: preprocessing, core conversion, and postprocessing.
During preprocessing, we resolve semantic mismatches between REST APIs and Protobuf schemas. REST specifications often rely on REST-specific conventions—such as method-and-path semantics, query parameters, and status-code-driven behavior—while Protobuf requires explicit, strongly typed request/response messages. We normalize the OpenAPI specification and apply a set of conversion rules that make these implicit REST behaviors explicit and safe to generate into Protobuf.
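As a toy illustration of this normalization (all field and operation names below are hypothetical, and the real pipeline operates on full OpenAPI documents), consider folding a REST operation's path and query parameters together with its body properties into one flat field list, which then maps one-to-one onto an explicit Protobuf request message:

```python
# Toy normalization sketch: fold REST-specific pieces of an OpenAPI
# operation (path params, query params, body properties) into a single
# flat field list, the shape an explicit Protobuf request message needs.
def normalize_operation(op: dict) -> dict:
    fields = []
    for p in op.get("parameters", []):
        fields.append({"name": p["name"], "type": p["schema"]["type"]})
    body = op.get("requestBody", {})
    for name, schema in body.get("properties", {}).items():
        fields.append({"name": name, "type": schema["type"]})
    return {"message": op["operationId"] + "Request", "fields": fields}

# Simplified, made-up fragment of a search operation spec.
search_op = {
    "operationId": "Search",
    "parameters": [
        {"name": "index", "in": "path", "schema": {"type": "string"}},
        {"name": "timeout", "in": "query", "schema": {"type": "string"}},
    ],
    "requestBody": {"properties": {"query": {"type": "object"}}},
}

print(normalize_operation(search_op))
```

After this step, implicit REST conventions (where a value travels: path, query string, or body) are gone, and generation into a strongly typed message becomes mechanical.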
Next, the core conversion step translates the preprocessed OpenAPI specification into Protobuf artifacts. While OpenAPI Generator provides mature support for generating clients and servers in many JSON-based formats, it didn’t have equivalent support for robust JSON-to-Protobuf conversion. To address this gap, we derived and contributed a set of conversion rules to OpenAPI Generator, ensuring that gRPC APIs stay structurally aligned with their REST counterparts as the API surface evolves.
Finally, postprocessing enforces long-term wire compatibility. Unlike REST, Protobuf APIs can’t tolerate changes such as field renumbering without breaking existing clients. The pipeline performs compatibility checks against previously generated Protobufs and applies safeguards to ensure new changes don’t invalidate older clients.
By codifying these rules into an automated workflow, we ensured that gRPC could remain a trustworthy, first-class API surface as OpenSearch continues to evolve.
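The postprocessing compatibility check can be sketched as follows. This is a deliberately simplified model: real tooling parses `.proto` files, while here a schema is just a `{field_name: field_number}` dict and all names are illustrative.

```python
# Toy wire-compatibility check of the kind run during postprocessing:
# compare an old and a new message schema, flagging changes that would
# break already-deployed Protobuf clients.
def breaking_changes(old: dict, new: dict) -> list:
    problems = []
    for name, number in old.items():
        if name not in new:
            # Conservatively treat removal as breaking (the number would
            # need to be reserved to stay safe on the wire).
            problems.append(f"field {name!r} removed (number {number})")
        elif new[name] != number:
            problems.append(f"field {name!r} renumbered {number} -> {new[name]}")
    old_numbers = set(old.values())
    for name, number in new.items():
        if name not in old and number in old_numbers:
            problems.append(f"number {number} reused by new field {name!r}")
    return problems

old_schema = {"query": 1, "size": 2}
print(breaking_changes(old_schema, {"query": 1, "size": 2, "timeout": 3}))  # additive: []
print(breaking_changes(old_schema, {"query": 2, "size": 1}))                # renumbering flagged
```

Additive changes pass cleanly, while renumbering or number reuse is rejected, mirroring the safeguards that keep older gRPC clients working as the API evolves.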
Uber Integration Example: Search Gateway
At Uber, we integrated the Search and Bulk gRPC APIs across multiple components of our search stack.
One example is the OpenSearch Gateway, a service that proxies all OpenSearch customer search and ingest requests at Uber before forwarding them to an OpenSearch cluster. The gateway adds security, observability, rate limiting, and auditing.
Before native gRPC support, the Search Gateway relied on an in-house adaptor to transpile Protobuf requests into JSON for OpenSearch's REST endpoints and to convert JSON responses back into Protobuf, adding latency and overhead to every customer request. With native gRPC, we resolved this tech debt and improved performance by removing the adaptor layer entirely and passing the client's Protobufs straight through to the OpenSearch gRPC endpoint.
Performance Impact
One of the key drivers of gRPC adoption was its performance benefits. Compared to REST, gRPC delivered substantial latency and throughput gains at Uber.
Bulk
M3 is Uber's in-house metrics system. With gRPC, M3 saw a roughly 60% p99 write latency reduction in production (from 34.1 ms to 13.6 ms). Similarly, p50 saw a notable reduction of around 34% (from 15.8 ms to 10.5 ms).
Another place we found performance gains was with the M3 Indexer. The Indexer is responsible for ingesting millions of data points per second into M3. A key metric for M3 is the maximum indexing delay: the amount of time for a metric to be indexed into M3. This is a top-line business-impacting metric, especially critical during failovers. With gRPC, the max indexing delay was reduced for the M3 short index by 20-35%.
Additionally, Uber OpenSearch customers run Apache Spark™ jobs to batch index their data into their OpenSearch clusters. Migrating these batch ingestion jobs to use the gRPC Bulk API under the hood yielded a 20-35% reduction in job runtime.
Search
Alongside Bulk, the greatest gains came from vector search queries, which carry a large vector in the request. Vectors serialize very inefficiently in JSON compared to Protobuf's packed encoding for repeated float fields.
Delivery shopping lists, powering recommendations for grocery stores at Uber Eats, saw a roughly 53% p50 search latency reduction (from 83ms to 38ms), and a roughly 43% p95 reduction (from 114ms to 64ms). p99 saw a roughly 14% reduction (from 205ms to 176ms), due to long-tail large queries.
Benefits for k-NN search grow with vector dimension: 1,572-dimension vectors showed larger latency reductions than 512- and 256-dimension vectors.
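The serialization gap is easy to see with a size comparison: JSON encodes each float as decimal text plus delimiters, while Protobuf's packed repeated float costs a fixed 4 bytes per element plus a small tag/length header. The sketch below approximates the packed encoding with `struct` rather than a generated Protobuf class, using a 1,572-dimension vector as in the measurements above.

```python
import json
import random
import struct

random.seed(42)
vector = [random.uniform(-1.0, 1.0) for _ in range(1572)]

# JSON: each float becomes decimal text plus a delimiter.
json_bytes = json.dumps(vector).encode("utf-8")

# Packed repeated float: 4 bytes per element (IEEE 754 single precision);
# real Protobuf adds only a few bytes of tag/length header, omitted here.
packed_bytes = struct.pack(f"<{len(vector)}f", *vector)

print(f"elements:    {len(vector)}")
print(f"JSON size:   {len(json_bytes)} bytes")
print(f"packed size: {len(packed_bytes)} bytes")   # 1572 * 4 = 6288 bytes
print(f"ratio:       {len(json_bytes) / len(packed_bytes):.1f}x")
```

The packed form also skips float-to-text parsing on the server, so the win shows up in CPU as well as bytes on the wire.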
Another place where we saw performance gains was with documents represented in binary formats. OpenSearch REST users primarily represent documents as JSON. With gRPC, it was easy to encode documents in other formats, such as CBOR or SMILE (a binary representation of JSON).
In particular, gRPC SMILE search was:
- 30% faster than REST JSON
- 45% faster than gRPC JSON
- 47% faster than REST SMILE
Performance Summary
From our performance analysis, we concluded that gRPC offers better performance for:
- Workloads with large request sizes
- Higher-throughput workloads at larger request rates (RPS)
- Documents represented with binary document formats
Conclusion
Introducing native gRPC endpoints to OpenSearch allowed us to close a major gap between Uber’s internal service ecosystem and its next-generation search platform.
By aligning OpenSearch with the same Protobuf-based contracts used across Uber, we eliminated translation layers, simplified integrations, and unlocked meaningful performance gains—especially for high-throughput ingestion and vector-heavy search workloads. Just as importantly, we did this while preserving REST compatibility, enabling teams to migrate incrementally rather than all at once.
One of our biggest takeaways is that API representation is not a surface-level choice. At scale, it shapes system evolution, performance ceilings, and developer velocity. As workloads increasingly involve large payloads, streaming, and machine learning-driven queries, these considerations become even more critical.
Beyond Uber, this work extends OpenSearch with a first-class gRPC transport that benefits the broader community. Native gRPC support enables new usage patterns, stronger typing, and better performance characteristics for anyone building latency-sensitive search systems.
Next Steps
Looking ahead, we plan to expand gRPC API coverage, deepen streaming support, and continue improving security and observability in the gRPC transport. We invite engineers to explore these capabilities, try them in their own systems, and contribute to the OpenSearch ecosystem as it continues to evolve.
Acknowledgments
We’d like to give a huge thanks to the mentorship and guidance from Uber leadership (Yupeng Fu, Shubham Guptas) and the AWS® engineers who partnered with us to build the transport, query converters, and Protobuf tooling (Andrew Ross, Finnegan Carroll, Peter Zhu, Saurabh Singh). We also appreciate the SIA, LucenePlus, and customer teams at Uber who partnered on migrations and testing.
Cover Photo Attribution: “Light trails @ night” by mostaque is licensed under CC BY 2.0.
Apache Lucene is a trademark of the Apache Software Foundation.
AWS® and the Powered by AWS logo are trademarks of Amazon.com, Inc. or its affiliates.
gRPC is a trademark of The Linux Foundation.
OpenSearch™ is a trademark of LF Projects, LLC.
Karen Xu
Staff Software Engineer
Builds scalable search systems; drives gRPC/Protobuf adoption and maintains OpenSearch projects.
Xi Lu
Software Engineer
Works on Uber Search Platform; maintains OpenSearch protobufs and API specification.
Sam Akrah
Software Engineer
Focuses on OpenSearch gRPC performance, integrations, and platform adoption at Uber.
Shuyi Zhang
Engineering Manager
Leads OpenSearch adoption and innovation at Uber; member of OpenSearch Observability TAG.
Michael Froh
Software Engineer
OpenSearch maintainer, Lucene committer, and TSC member driving Uber Search Platform.