Engineering, Backend

How Uber Indexes Streaming Data with Pull-Based Ingestion in OpenSearch™

December 16 / Global
Figure 1: Comparing how push-based and pull-based systems handle traffic spikes.
Figure 2: Streaming ingestion architecture.
Figure 3: Streaming ingestion data flow.
Figure 4: Shard recovery and replica promotion.
Figure 5: Multi-writer ingestion.
Figure 6: Ingestion modes.
Figure 7: Pull-based indexing model at Uber.
Yupeng Fu

Yupeng Fu is a Principal Software Engineer on Uber’s SSD (Storage, Search, and Data) team, building scalable, reliable, and performant online data platforms. Yupeng is a maintainer of the OpenSearch project and a member of the OpenSearch Software Foundation TSC (Technical Steering Committee).

Varun Bharadwaj

Varun Bharadwaj is a Software Engineer on Uber’s Search Platform team, building scalable and performant solutions that power Uber’s search capabilities. He’s an OpenSearch project maintainer and a contributor to the pull-based ingestion feature in OpenSearch.

Shuyi Zhang

Shuyi Zhang is an Engineering Manager at Uber, leading the company’s OpenSearch adoption and development as well as its open source innovation. She’s also a member of the Observability technical advisory group under the OpenSearch project.

Xu Xiong

Xu Xiong is a Software Engineer on Uber’s Search Platform team, focusing on building and scaling the search platform. He’s also a contributor to the pull-based ingestion feature in OpenSearch.

Michael Froh

Michael Froh is a Software Engineer on Uber’s Search Platform team. He’s an OpenSearch project maintainer, an Apache Lucene committer, and a member of the OpenSearch Software Foundation TSC (Technical Steering Committee).

Posted by Yupeng Fu, Varun Bharadwaj, Shuyi Zhang, Xu Xiong, Michael Froh