
How did the pedestrian cross the road?
Contrary to popular belief, sometimes the answer isn’t as simple as “to get to the other side.” To bring safe, reliable self-driving vehicles (SDVs) to the streets, our machine learning teams at Uber Advanced Technologies Group (ATG) must master this scenario by predicting the range of possible real-world outcomes of a pedestrian’s decision to cross the road. To understand how this scenario might play out, we need to measure its many variations in real pedestrian behavior. These measurements power performance improvement flywheels for:
- Perception and Prediction: machine-learned models with comprehensive, diverse, and continuously curated training examples (improved precision/recall, decreased training time, decreased compute).
- Motion Planning: capability development with scenario-based requirements (higher test pass-rate, lower intervention rate).
- Labeling: targeted labeling jobs with comprehensive, diverse, and continually updated scenarios (improved label quality, accelerated label production speed, lowered production cost).
- Virtual Simulation: tests aligned with real-world scenarios (higher test quality, more efficient test runs, lowered compute cost).
- Safety and Systems Engineering: statistically significant specifications and capability requirements aligned with the real world (improved development quality, accelerated development speed, lowered development cost).
With the goal of measuring a scenario in the real world, let’s head to the streets to study how pedestrians cross them.
Driving to observe pedestrians
To understand the various ways a pedestrian might cross the street, we start by driving an SDV in a real neighborhood to observe pedestrian behavior. With a driver behind the wheel and the SDV’s perception system activated, the onboard computer detects, tracks, and records the movement of the pedestrians it sees.
For this example analysis, let’s take a sample of 312 miles of SDV driving over the course of 26 non-consecutive hours around a 1.7-square-mile neighborhood, as depicted in Figure 1, below:

The bar height in Figure 1, above, indicates the number of times the SDV drove on a specific lane. The “spikes” at intersections result from the SDV crossing the same intersection multiple times as part of a “grid-coverage” driving pattern.
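To make the per-lane counts behind Figure 1 concrete, here is a minimal sketch of how such a tally could be computed. It assumes the drive log has already been reduced to an ordered sequence of lane IDs; the function name, the lane IDs, and the sample log are hypothetical and not taken from our pipeline.

```python
from collections import Counter


def lane_traversal_counts(lane_sequence):
    """Count how many times the vehicle entered each lane.

    Consecutive repeats of the same lane ID are treated as a single
    traversal, on the assumption that the vehicle is still on that lane.
    """
    counts = Counter()
    previous = None
    for lane_id in lane_sequence:
        if lane_id != previous:
            counts[lane_id] += 1
        previous = lane_id
    return counts


# Hypothetical log: "X" stands for an intersection lane that a
# grid-coverage pattern passes through repeatedly.
sample_log = ["A", "A", "X", "B", "B", "X", "C", "X", "A"]
counts = lane_traversal_counts(sample_log)
for lane_id, n in counts.most_common():
    print(lane_id, n)
```

In this toy log the intersection lane "X" gets the highest count, which is the same effect that produces the spikes at intersections.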
An ideal driving sample would contain an equal number of miles driven on every street, at every hour, day, week, and month, under the same weather conditions. The down-selected sample used in this analysis consists of 312 miles over 26 hours of SDV driving, so it’s worth highlighting the resulting selection bias. For example, the driving does not cover all streets equally (Figure 1), and it occurs mostly on weekdays between 9am and 3pm (Figure 2). This tells us that the resulting measurements of pedestrian behavior will skew toward these streets and times of day.
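One simple way to put a number on this kind of time-of-day skew is to compute the share of driving hours that falls inside the dominant window. The sketch below is illustrative only: the segment format (a start time and a duration in hours), the function name, and the sample values are assumptions, and each segment is attributed entirely to the hour in which it starts.

```python
from datetime import datetime


def weekday_window_share(segments, start_hour=9, end_hour=15):
    """Return the fraction of driving hours that start on a weekday
    within [start_hour, end_hour).

    `segments` is a list of (start_datetime, duration_hours) pairs.
    Simplifying assumption: a segment counts entirely toward the hour
    in which it starts.
    """
    total = sum(duration for _, duration in segments)
    if total == 0:
        return 0.0
    inside = sum(
        duration
        for start, duration in segments
        if start.weekday() < 5 and start_hour <= start.hour < end_hour
    )
    return inside / total


# Hypothetical drive segments, not real data.
segments = [
    (datetime(2019, 6, 3, 10), 2.0),  # Monday, mid-morning
    (datetime(2019, 6, 4, 13), 1.5),  # Tuesday, early afternoon
    (datetime(2019, 6, 8, 11), 0.5),  # Saturday
    (datetime(2019, 6, 5, 17), 1.0),  # Wednesday, evening
]
share = weekday_window_share(segments)
print(f"{share:.0%} of driving hours fall on weekdays between 9am and 3pm")
```

A share close to 100% would mean the sample says little about evenings or weekends, so any pedestrian statistics derived from it should be read with that window in mind.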

Data mining the scenario “pedestrian crossing the street”
