What You’ll Do
- Own the core data pipeline (ETL): design, develop, test, deploy, maintain, and document robust, scalable, reusable, efficient, production-quality software that solves challenging problems
- Collaborate and communicate closely with analytics to identify, propose, and build the infrastructure, large-scale data pipelines, data storage strategy, common libraries, and tools needed to shape data into inputs for machine learning algorithms
- Evolve the data model and schema based on business and engineering needs
- Implement systems to track data quality and consistency
- Develop tools supporting self-service data pipeline management (ETL)
- Tune SQL and MapReduce jobs to improve data processing performance
- Research and incorporate emerging software infrastructures, tools, and technologies, especially pertaining to data processing
- Champion and evangelize the adoption of engineering best practices and methodologies
What You’ll Need
- Minimum of 3 years' experience building production-level software systems, preferably with Python, Java, Go, or Scala
- Experience architecting and building large-scale batch processing (ETL) pipelines with Big Data tools such as Hadoop, Spark, Hive, Presto, or Cassandra
- Comfortable developing in a Linux environment
- Demonstrable track record of learning new technologies and deep-diving into complex existing ones as needed
- Strong sense of ownership, initiative, and a can-do attitude
- Great attention to detail and a data-driven approach to problem solving
- Team player with strong collaboration and communication skills who can motivate and mobilize cross-functional teams and respond positively to feedback
- Experience with large-scale data warehousing architecture and data modeling
- Proficient in at least one SQL dialect (MySQL, PostgreSQL, MS SQL, Oracle)
- Good understanding of SQL engine internals and the ability to conduct advanced performance tuning
- Comfortable working directly with data analysts to bridge Uber's business goals with data engineering
Bonus Points If
- BS/MS/PhD in Computer Science or a related field
- 1+ years of experience with workflow management tools (Airflow, Oozie, Azkaban, UC4)
About the Team
At the Advanced Technologies Group (ATG), we are building technologies that will transform the way the world moves. Our teams in Pittsburgh, San Francisco, and Toronto are dedicated to mapping, software and hardware development, vehicle safety, and operations for self-driving technology. Our teams are passionate about developing a self-driving system that will one day move people and things around more safely, efficiently, and cost effectively.
At Uber, we believe technology has the power to make transportation more efficient, accessible, and safer than ever before. Self-driving technology has the potential to make these benefits an everyday reality for our customers, but it’s not going to happen overnight. Building best-in-class self-driving technology will take time, and safety is our priority every step of the way. Operating inclusively and transparently, and displaying responsible behavior within a structured development process, are critical to safety. We at ATG seek candidates who will model these values.
The Global Supply Chain and Business Operations team scales best-in-class technology and processes to deliver safe, reliable, and accessible transportation for the Uber platform.