Uber

Software Engineer - Data Modeling Platform (Product Platform)

Data, Engineering in San Francisco, CA

Uber’s businesses generate millions of mobile events every minute; billions of GPS points and millions of trips are processed through our platform daily. The Global Data Warehouse team’s mission is to model the core parts of the business. Data scientists, product managers, and operations teams around the world consume and analyze our data. Our global data warehouse serves hundreds of thousands of queries every week, and we maintain hundreds of ETL pipelines that deliver data on core business entities in a timely, accurate, and complete manner.


The Global Data Warehouse Team (GDW) powers analytics for many of Uber’s businesses. Want to know how many users joined Uber as Riders and subsequently decided to become Drivers on our platform? The Global Data Warehouse team maintains the data objects that answer this question. Need to analyze how the wait times shown in the Rider app correlate with Rider and Driver ratings? We have the data at the ready. We model tables and build data pipelines for the core of our business, including Driver, Rider, and Trip analytics. We collaborate with teams including Eats, Fraud, Ops, Finance, and Marketing to support domain-specific needs. We ingest truly massive volumes of data generated by our globally distributed users and structure this data in an analytics-friendly way, while guaranteeing the highest fidelity of historical data and low latency: questions at Uber can’t wait long for answers.



As a Senior Software Engineer in Data at Uber, you will play a leading role in scaling the global data warehouse to power analytics for teams across Uber. You are a self-starter with extensive industry experience in SQL, data modeling, and ETL pipeline design. You have deep experience implementing ETL pipelines in Hive or another MPP database architecture. You are comfortable with Spark and Presto, having used one or both frequently to process very large volumes of data. You have at least a working knowledge of a streaming analytics platform, and you are comfortable coding in Python, Java, or Scala. You have demonstrated strong competency in reliably operating hundreds of ETL pipelines under strict SLAs and in quickly root-causing and correcting complex data problems. Peers describe you as the go-to person for the most challenging data ingestion and modeling problems. You actively mentor junior team members and attract others inside and outside your company to join your team. Attention to detail, thoroughly tested code, and great documentation are the hallmarks of your work, but you excel equally at explaining concepts in “big picture” terms to a less technical audience. If this describes you and you tick the boxes below, we would love to hear from you.

Required skills:

  • 5+ years of expertise creating and evolving dimensional data models and schema designs to structure data for business-relevant analytics.
  • 5+ years hands-on experience using SQL to build and deploy production-quality ETL pipelines.
  • 3+ years of experience ingesting and transforming structured and unstructured data from internal and third-party sources into dimensional models.
  • 3+ years experience writing and deploying Python, Scala, or Java code.
  • 3+ years of hands-on experience using Hadoop, Hive, Vertica, or another MPP database system such as AWS Redshift or Teradata.
  • 2+ years of experience building and operating real-time streaming data pipelines using Spark Streaming or Flink.
  • Track record of successful partnerships with product and engineering teams resulting in on-time delivery of impactful data products.
  • Demonstrated ability to think strategically about business, product, and technical challenges and implement data solutions which scale to meet future needs.
  • Experience developing scripts and tools to enable faster data consumption.
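To illustrate the dimensional modeling the role centers on, here is a minimal star-schema sketch: a fact table of trips joined to rider and driver dimensions, queried for a business metric. All table and column names here are hypothetical examples, not Uber’s actual warehouse models.

```python
import sqlite3

# Toy star schema: one fact table (trips) referencing two dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_rider  (rider_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_driver (driver_key INTEGER PRIMARY KEY, rating REAL);
CREATE TABLE fact_trip  (
    trip_key   INTEGER PRIMARY KEY,
    rider_key  INTEGER REFERENCES dim_rider(rider_key),
    driver_key INTEGER REFERENCES dim_driver(driver_key),
    fare_usd   REAL
);
INSERT INTO dim_rider  VALUES (1, 'San Francisco'), (2, 'New York');
INSERT INTO dim_driver VALUES (10, 4.9), (11, 4.7);
INSERT INTO fact_trip  VALUES (100, 1, 10, 23.50),
                              (101, 1, 11, 8.25),
                              (102, 2, 10, 14.00);
""")

# A typical analytics query against the model: total fares per rider city.
rows = conn.execute("""
    SELECT r.city, ROUND(SUM(f.fare_usd), 2) AS total_fares
    FROM fact_trip f
    JOIN dim_rider r USING (rider_key)
    GROUP BY r.city
    ORDER BY r.city
""").fetchall()
print(rows)  # [('New York', 14.0), ('San Francisco', 31.75)]
```

The same join-fact-to-dimension pattern scales up to Hive, Vertica, or Redshift; the dimensions carry descriptive attributes while the fact table stays narrow and append-friendly.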

 

Preferred skills:

  • In-depth understanding of Kimball’s data warehouse lifecycle.
  • Extensive experience with real-time data ingestion and stream processing.
  • Demonstrated familiarity with industry-leading Big Data ETL practices.
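The stream-processing skills above revolve around windowed aggregation over event streams. The sketch below shows the core idea, a tumbling-window count, in plain Python; the event names are made up for illustration, and a real pipeline would use Spark Streaming or Flink rather than an in-memory loop.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_sec=60):
    """Count events per (window, key). `events` is an iterable of
    (epoch_seconds, event_name) pairs. A toy stand-in for the windowed
    aggregations Spark Streaming / Flink perform at scale."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = (ts // window_sec) * window_sec  # bucket into fixed windows
        counts[(window_start, key)] += 1
    return dict(counts)

# Hypothetical event stream: two requests in the first minute,
# one request and one completion in the second.
events = [(5, "trip_requested"), (30, "trip_requested"),
          (65, "trip_requested"), (70, "trip_completed")]
result = tumbling_window_counts(events)
print(result)
# {(0, 'trip_requested'): 2, (60, 'trip_requested'): 1, (60, 'trip_completed'): 1}
```

Production systems add what this sketch omits: out-of-order events, watermarks, and state that survives process restarts.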

View the Candidate Privacy Statement

At Uber, we don’t just accept difference: we celebrate it and support it for the benefit of all of our employees, products, and communities. Uber is proud to be an equal opportunity and affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity, or veteran status.