
From Batch to Streaming: Accelerating Data Freshness in Uber’s Data Lake

11 December / Global
Figure 1: IngestionNext architecture. 
Figure 2: Parquet file merging record by record. 
Figure 3: Row-group merging with data masking. 
Figure 4: Simplified row-group merging by grouping schema.
Figure 5: Before/after streaming ingestion.
Xinli Shang

Xinli Shang is the former Apache Parquet™ PMC Chair, a Presto® committer, and a member of Uber’s Open Source Committee. He leads several initiatives advancing data format innovation for storage efficiency, security, and performance. Xinli is passionate about open-source collaboration, scalable data infrastructure, and bridging the gap between research and real-world data platform engineering.

Peter Huang

Peter Huang is the architect of Uber’s hybrid cloud streaming platform and an active committer for OpenLineage. He leads several key initiatives, including the Self-Serve Flink SQL platform and the evolution of Uber’s streaming ingestion infrastructure. His primary focus is on enabling business-critical use cases by designing highly reliable and scalable data processing systems built on Apache Flink.

Jing Li

Jing Li is a Senior Staff Engineer on the Data team at Uber. She has worked across multiple domains, including data ingestion, data quality, and open table formats.

Jing Zhao

Jing Zhao is a Principal Engineer on the Data team at Uber. He is a committer and PMC member of Apache Hadoop and Apache Ratis.

Jack Song

Jack Song is an engineering leader specializing in large-scale Data and AI platforms. At Uber, he leads the Data Platform organization, building multi-cloud infrastructure, multi-modal data systems, and the agentic automation layer that powers Uber’s next-generation Data AI Agents.

Posted by Xinli Shang, Peter Huang, Jing Li, Jing Zhao, Jack Song