Software Engineer, Hadoop Analytics & Infrastructure

Engineering in Palo Alto, CA

About Uber

We’re changing the way people think about transportation. Not that long ago we were just an app to request premium black cars in a few metropolitan areas. Now we’re a part of the logistical fabric of more than 600 cities around the world. Whether it’s a ride, a sandwich, or a package, we use technology to give people what they want, when they want it.

For the people who drive with Uber, our app represents a flexible new way to earn money. For cities, we help strengthen local economies, improve access to transportation, and make streets safer.

And that’s just what we’re doing today. We’re thinking about the future, too. With teams working on autonomous trucking and self-driving cars, we’re in for the long haul. We’re reimagining how people and things move from one place to the next.

About the role

Uber is currently looking for engineers with expertise in, and a passion for, building large-scale data analytics systems. The Hadoop Analytics and Infrastructure team is part of the Data team at Uber. Based in Palo Alto and San Francisco, the team is responsible for building the interactive and batch querying systems, the advanced data processing platforms, and the underlying storage and resource management infrastructure.

Our mission is to design, develop, and manage world-class big data systems that are highly scalable, available, fault-tolerant, secure, powerful, and efficient, empowering data-driven decisions for every group within Uber, from data scientists to city operations teams and from product engineers to marketing. Some of the products that we power include driver/rider matching, ETA calculations, image recognition for Maps and autonomous vehicles, secure data access, and ad hoc exploration of city-level patterns.

What you’ll do

  • Deliver a completely self-service parallel compute framework based on Apache Spark for a variety of near real-time and big data applications running on YARN and Mesos
  • Provide interactive SQL access to tens of petabytes of data, with latency of a few seconds, using Presto
  • Provide Hive as a highly reliable and available service for Uber’s bulk data processing needs; provide Uber-specific optimizations and features such as geo-spatial-temporal support
  • Build a highly scalable, reliable and efficient data storage system based on HDFS for Uber’s data lake
  • Build an interactive workbench to boost the productivity of Uber’s data scientists
  • Provide data security through authentication, authorization, and auditing mechanisms

What you’ll need

  • We are looking for curious, self-motivated engineers with strong coding, testing, debugging and design skills
  • Solid understanding of distributed systems and system fundamentals such as concurrency, multithreading, and locking
  • Past data infra experience or knowledge of the Hadoop ecosystem is not necessary (we will mentor you!), but if you do have past data infra experience, we really want to chat with you :-)
  • Be customer-obsessed and have the ability to translate customer and technical requirements into detailed architecture and design

Bonus points if you have

  • Experience with large-scale data analytics, query optimization and execution, highly available and fault-tolerant systems, replicated data storage, and operating complex services running on-prem or in the cloud
  • Under-the-hood experience with some of the big data analytics technologies we currently use, such as Apache Hadoop (HDFS and YARN), Hive, Spark, Docker/Mesos, and Tez (Presto is a plus), or with similar systems such as Vertica, Apache Impala, Drill, Google Borg, Google BigQuery, Amazon Redshift, and Kubernetes

About the team

The Hadoop Analytics and Infrastructure team is responsible for meeting the data storage and batch processing needs of the rest of the company. We are a small, tightly knit team with a diverse set of backgrounds, including Facebook, Google, Cloudera, Hortonworks, Amazon, LinkedIn, Twitter, Pinterest, Dropbox, other startups, and recent college grads. The areas listed above are technically deep and are undergoing massive innovation in the community. Uber, as a business, is also growing rapidly, and data is at the heart of many products, e.g., pricing predictions, route determination, ETAs, fraud detection, and the storage and processing of autonomous vehicle logs.

By solving these business problems, you will not only help Uber but also have a front-row seat to build and innovate on the big data systems of the future and contribute them back to open source. This is an exciting time to be a Data Infrastructure engineer at Uber. Be sure to check out our engineering blog to learn more about the team.

See our Candidate Privacy Statement

At Uber we don’t just accept difference—we celebrate it, we support it, and we thrive on it for the benefit of our employees, our products and our community. Uber is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status.