Open source software has been integral to the success of the Uber platform and its underpinning technologies since the company’s founding in 2009. Since then, Uber engineers have created new open source projects, and have leveraged and given back to existing ones, to build our data infrastructure, microservices, and mobile apps. Over time, increased demand for our services around the world required more engineering staff and, in turn, led to more open source contributions.
Recognizing the need to create open source standards and best practices at Uber, we launched the Open Source Program Office (OSPO) in 2019. Our OSPO serves as a central resource for management and documentation of open source at Uber, overseeing general open source strategy, establishing policies and processes, and eliminating friction from using or contributing to open source, among other duties.
Establishing our OSPO formalized Uber’s relationship with the open source community, giving our engineers necessary guidelines and structure to ensure proper contribution to and use of open source projects. Throughout 2019, OSPO enabled numerous open source contributions from Uber engineers, while promoting membership in organizations such as the Urban Computing Foundation, the InterUSS Platform for drone interoperability, and the LF Presto Foundation.
An auspicious start
Our open source efforts received a boost at the beginning of the year when InfoWorld named one of our past projects, Horovod, a Technology of the Year award winner for 2019. We open sourced Horovod in 2017 to enable distributed deep learning using TensorFlow, and contributed it to the LF Deep Learning Foundation in 2018. Recent contributions to Horovod extended its use to Keras, PyTorch, PySpark, and Apache MXNet.
We were also happy to see Pyro, another machine learning project we open sourced in 2017, get accepted by the LF AI Foundation Technical Board as an incubation project early in 2019. As a probabilistic programming language, Pyro unifies deep learning and Bayesian modeling to accelerate research that leverages these two techniques.
Further adding to the portfolio of open source deep learning projects, we were happy to contribute Ludwig, a deep learning toolbox built on top of TensorFlow that lets users train and test deep learning models without writing code. By itself, Ludwig makes deep learning usable by a greater number of people, while open sourcing it allows development for many new use cases and adoption by leading ML researchers in industry and academia.
Uber Open Source in early 2019 wasn’t all about deep learning, though. Our data infrastructure necessitates a lot of internal innovation, too, and in that spirit, we gave back some of Uber’s top data platform projects to the broader open source community. One project from that area of work we chose to open source, AresDB, not only enables real-time data analytics, but also leverages GPU acceleration to process queries at scale.
Managing a platform serving users around the globe gives our engineers plenty of experience developing for scale, and much to contribute to the community. Another solution we contributed to the community early in 2019, Peloton, combines compute clusters and manages resources for distinct workloads. This unified resource scheduler works both in the cloud and on on-premises servers, allowing for scale, workload prioritization, and resource optimization.
In the past, our Visualization team showed its commitment to open source with popular projects such as kepler.gl and deck.gl, and this year we took our public visualization suite a step further with AVS, an open standard for autonomous vehicle visualization. We designed AVS to spur autonomous vehicle development by contributing a tool able to ingest data from any kind of vehicle sensor, and create a visualization of how those sensors perceive their environment. Where previously every organization working on autonomous vehicles had to develop this type of tool internally, now teams at any organization can leverage AVS and modify it for their own use cases.
We maintain a high pace of development at Uber in general, as our engineers continually seek to improve the transportation and delivery services offered on our platform. Our open source engagement matches this pace, as shown by the numerous contributions made to open source projects by Uber engineers.
Alongside this technical work, our OSPO looks for opportunities to support the open source community. In mid-2019 we co-founded the Urban Computing Foundation (UCF), an organization under the auspices of The Linux Foundation devoted to the open development of software that improves urban mobility, transportation, safety, and infrastructure. As an initial contribution, we transferred management of kepler.gl, our open source Web-based data visualization tool, to the UCF.
In 2019, we also submitted Hudi, a project we built in 2017 to support low-latency ingestion and data preparation for HDFS, to the Apache Incubator. Uber leverages Apache Spark, Apache Hadoop, and several other projects from The Apache Software Foundation, so contributing Hudi to the broader Apache community made perfect sense. Under incubation, Hudi is already becoming an important framework within the Hadoop ecosystem.
As the end of the year approached, we found another area where we could support the community: co-founding the LF Presto Foundation. Uber engineers enthusiastically use and contribute to Presto, a data source-agnostic SQL query engine that helps our internal users gain insights through data analytics with very low latency. That made our decision to partner with other leading technology companies to found the LF Presto Foundation an easy one.
At the end of October, we applauded Jaeger’s graduation from the Cloud Native Computing Foundation (CNCF) Incubation program. Jaeger, a distributed tracing platform created by Uber in 2015, helps companies trace and troubleshoot cloud-based architectures. CNCF accepted Jaeger as an incubation project in 2017. Jaeger follows such projects as Kubernetes and CoreDNS in becoming a full-fledged CNCF open source project due to the open source community’s engagement in its development and its adoption by the broader industry.
October also saw further independent recognition of our open source projects, with InfoWorld awarding Bossies, a recognition given to top open source projects, to both Ludwig and Kraken. Ludwig, as mentioned above, makes it easier to use deep learning, while Kraken seamlessly stores and distributes Docker images at scale.
In December, we announced our conformance to the OpenChain Specification, a widely adopted umbrella project of The Linux Foundation that defines inflection points in business workflows where process, policy, and training should exist to make open source license compliance simpler and more consistent. By following this standard, we can continue to establish trust in our open source partnerships and with the broader community.
Finally for 2019, we were very happy to announce the Dev/Mission <> Uber Coding Fellowship, a program that combines our support for open source with our promotion of diversity and inclusion within the engineering community. The San Francisco-based nonprofit Dev/Mission offers free programming courses, mentorship, and resources to help young people from underrepresented communities find careers in STEM fields. Having partnered with Dev/Mission since 2017, we have now launched the Uber Coding Fellowship, where Uber engineers will teach eight alumni from Dev/Mission how to use open source tools and frameworks such as Git and Node.js.
Uber Open Source in 2020
The well-documented value of open source software includes such attributes as code transparency, cost-effectiveness, and a broad developer base. Open source software projects extend from Big Data to artificial intelligence to front-end tooling. It is difficult, in fact impossible, to imagine Uber’s current engineering culture existing without it.
Recognizing the value of open source software and its community to Uber, our OSPO continues to support open source projects’ sustainability. We look forward to making further contributions to open source projects and the community in 2020.