AI, Data / ML

Gaining Insights in a Simulated Marketplace with Machine Learning at Uber

June 24, 2019 / Global
Figure 1. The Automatic Training Pipeline (center) fetches raw data from Hive using Spark (above) and uses the Simulation ML Models (left) to save data to the Storage Service (right) and Simulation Database (bottom).
Figure 2. To create a simulation, the program first requests a simulation (top left). Next, it fetches backend service model metadata and stores it in the database. The database then fetches checkpoints based on this metadata from its Storage Service and downloads them to its disk. These checkpoints then combine with Simulation ML Models (upper center) in the simulator, where the Model Factory instantiates models into the core flow.
Figure 3. Our Driver Movement Hybrid model begins with a driver request simulating its movement and goes through a series of branching “Yes/No” options to reach a conclusion, send driver information to the Routing Engine, estimate the driving speed, and move the simulated driver through the route.
Figure 4. One of the decision tree structures of our stochastic model for simulating off-trip driver distribution starts with “Is the driver in rush hour” at the top and uses a series of “Yes/No” branches to produce leaf nodes at the bottom.
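The traversal described in Figure 4 can be sketched as a walk from the root question down to a leaf. This is a minimal illustration only: the `Node` class, the `find_leaf` function, the `in_rush_hour` field, and the leaf names are hypothetical, and the real tree has more levels than the single question shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    # Internal nodes carry a yes/no question; leaves carry only a leaf_id.
    question: Optional[Callable[[dict], bool]] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    leaf_id: Optional[str] = None


def find_leaf(node: Node, driver: dict) -> str:
    """Follow Yes/No branches until a leaf node is reached."""
    while node.leaf_id is None:
        node = node.yes if node.question(driver) else node.no
    return node.leaf_id


# Hypothetical one-level tree; only the root question comes from the caption.
TREE = Node(
    question=lambda d: d["in_rush_hour"],
    yes=Node(leaf_id="rush_hour_leaf"),
    no=Node(leaf_id="off_peak_leaf"),
)

leaf = find_leaf(TREE, {"in_rush_hour": True})
```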
Figure 5. This probability table (with three rows and three columns labeled vertically and horizontally Grid Cell 1, Grid Cell 2, and Grid Cell 3) shows probability values for various grid cells, which represent locations on Earth.
Figure 6. This flowchart begins at a leaf node, uses a map to explain how the simulation fetches an identifiable grid cell to pinpoint a driver, moves to another highlighted map to show how the simulation calculates driver movement probabilities with a transition matrix, and ends with a third map to show how the simulation selects a location as the driver’s destination. This is our tree-based stochastic model for predicting driver destination.
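The destination-selection step in Figures 5 and 6 amounts to sampling from one row of a transition matrix. The sketch below assumes a three-cell matrix with made-up probabilities; `TRANSITIONS`, `sample_destination`, and the cell names are illustrative and do not come from the simulator itself.

```python
import random

# Hypothetical transition matrix: each row gives the probability of a driver
# moving from the row's grid cell to each destination grid cell.
TRANSITIONS = {
    "cell_1": {"cell_1": 0.5, "cell_2": 0.3, "cell_3": 0.2},
    "cell_2": {"cell_1": 0.1, "cell_2": 0.6, "cell_3": 0.3},
    "cell_3": {"cell_1": 0.4, "cell_2": 0.4, "cell_3": 0.2},
}


def sample_destination(current_cell, transitions, rng):
    """Draw a destination cell using the probabilities in the current cell's row."""
    row = transitions[current_cell]
    cells = list(row)
    weights = [row[cell] for cell in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so a simulation run can be repeated
destination = sample_destination("cell_1", TRANSITIONS, rng)
```

Passing the random generator in explicitly keeps the draw stochastic while still letting a run be reproduced from a seed.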
Figure 7. The map on the left highlights areas where drivers are distributed in the real world. The map on the right simulates driver distribution in the same area. Due to the accuracy of our simulation, the maps are nearly identical.
Figure 8. The top half of this image represents the recommendation system, which narrows down and ranks drivers through a few steps. The flowchart for the recommendation system begins by finding all possible drivers for each rider, narrowing driver options down to hundreds, generating links, narrowing driver options down to dozens, then, when there are fewer than ten, ranking them. Once this process is complete, an arrow shows these recommendations moving into the matching phase. At first, the algorithm may link multiple riders to a single driver, but with our maximum bipartite matching algorithm, we pair up individual drivers with single riders.
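The final matching step in Figure 8 can be illustrated with a standard augmenting-path approach to maximum bipartite matching. This is one textbook way to compute such a matching, not necessarily the implementation used in production; `max_bipartite_matching` and the rider and driver IDs are hypothetical.

```python
def max_bipartite_matching(candidates):
    """Pair each rider with at most one driver, maximizing the number of pairs.

    candidates maps a rider ID to the list of driver IDs recommended for it.
    Returns a dict of rider ID -> driver ID.
    """
    driver_to_rider = {}

    def try_assign(rider, visited):
        for driver in candidates[rider]:
            if driver in visited:
                continue
            visited.add(driver)
            current = driver_to_rider.get(driver)
            # Take a free driver, or re-route the rider currently holding it.
            if current is None or try_assign(current, visited):
                driver_to_rider[driver] = rider
                return True
        return False

    for rider in candidates:
        try_assign(rider, set())
    return {rider: driver for driver, rider in driver_to_rider.items()}


# Two riders are both linked to driver d1; the matching resolves the conflict.
pairs = max_bipartite_matching({"r2": ["d1", "d2"], "r1": ["d1"]})
```

In this example `r2` initially takes `d1`, then is moved to `d2` so that `r1`, whose only candidate is `d1`, can also be matched.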
Figure 9. A larger outer circle of rider nodes uses aggregated feature information from neighbors to direct rider nodes to driver nodes within a smaller circle, which is then aggregated again to narrow the selection to one rider node.
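The neighbor aggregation that Figure 9 depicts can be sketched with a simple mean aggregator. The caption does not say which aggregator is used, so the mean, the concatenation with the node's own features, and all names below (`aggregate_neighbors`, the node IDs, the feature values) are assumptions made for illustration.

```python
def aggregate_neighbors(features, neighbors, node):
    """Concatenate a node's own features with the mean of its neighbors' features."""
    own = features[node]
    linked = neighbors.get(node, [])
    if not linked:
        # No neighbors: pad with zeros so the output size stays fixed.
        return list(own) + [0.0] * len(own)
    mean = [
        sum(features[n][i] for n in linked) / len(linked)
        for i in range(len(own))
    ]
    return list(own) + mean


# Hypothetical two-dimensional features for one rider and two linked drivers.
features = {"rider": [1.0, 0.0], "d1": [2.0, 4.0], "d2": [4.0, 0.0]}
neighbors = {"rider": ["d1", "d2"]}
embedding = aggregate_neighbors(features, neighbors, "rider")
```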
Haoyang Chen

Haoyang Chen is an engineer on Uber’s Marketplace Simulation team.

Wei Wang

Wei Wang is a senior engineer on Uber’s Marketplace Simulation team.

Posted by Haoyang Chen, Wei Wang