Sr Data Engineer (Architect), Global Data Warehouse
About the Role
As a Data Engineer at Uber, you will play a leading role in scaling the core data warehouse to power analytics for teams across Uber. You are a self-starter with demonstrable experience leveraging Hadoop, Spark, and Presto at scale when building data solutions. Peers describe you as the go-to person for the most challenging data architecture, ingestion, processing, and modeling problems. Attention to detail, thoroughly tested code, high data integrity, and great documentation are the hallmarks of your work, but you are equally good at explaining concepts in "big picture" terms to a less technical audience.
If this describes you and you tick off the boxes below, we would love to hear from you.
What You'll Do
- You will focus on data architecture, design, source data instrumentation, ETL pipeline optimization, and data model implementation.
- You will work extensively with HDFS, Hive, Presto, and Spark to build efficient and scalable solutions for end-users.
- You will also work on the HiveETL framework to implement and extend its functionality using Python such that all users of the framework benefit from it.
- You will identify limitations and required features in data tools and partner with peer teams to design and implement them.
- You will automate manual work by developing scripts and utilities for repeated tasks.
- You will own big data problems, understand all underlying infra systems and platforms, and work towards reducing resource requirements and improving customer SLAs.
- You will partner with teams to ensure data is logged and validated correctly in source systems such as Kafka, Schemaless, and MySQL so that downstream analysis is consistent, accurate, and complete.
What You'll Need
- 7+ years of experience producing world-class software in a highly competitive environment.
- Proven experience in creating and evolving dimensional data models & schema designs to structure data for business-relevant analytics.
- Strong experience using SQL to build and deploy production-quality ETL pipelines.
- Hands-on experience using Hadoop, Hive, and Spark.
- 3+ years of experience ingesting and transforming structured and unstructured data from internal and third-party sources into dimensional models.
- 2+ years of experience writing and deploying Python, Scala, or Java code.
- Track record of partnering with product and engineering teams to deliver data products.
- Demonstrated ability to think strategically about business, product, and technical challenges and implement data solutions which scale to meet future needs.
- Experience developing utilities and tools to enable faster data consumption.
- Familiarity with Kimball's data warehouse lifecycle.
- Experience with real-time data ingestion and stream processing.
- 1+ years of experience using Spark and Presto for data transformations.
About the Team
The Global Data Warehouse team builds and maintains a set of high-quality, analytics-optimized datasets derived from petabytes of raw data at Uber. We empower data scientists, operations teams, and engineers with high-quality canonical data models to make accurate business and operational decisions. We also build ETL frameworks and tools that enable data processing at Uber's scale.
At Uber, we ignite opportunity by setting the world in motion. We take on big problems to help drivers, riders, delivery partners, and eaters get moving in more than 600 cities around the world.
We welcome people from all backgrounds who seek the opportunity to help build a future where everyone and everything can move independently. If you have the curiosity, passion, and collaborative spirit, work with us, and let's move the world forward, together.
Uber is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, or any other characteristic protected by law.