Hello Uberites! We’re back again with another #UberData post. We recently launched in NYC, and for us that means mo’ money, mo’ problems. Take estimated arrival times (ETAs), for example. When we launch in a new city, we simply don’t have historical data to draw estimates from. That’s a problem: not only are ETAs woven into virtually every corner of our supply chain and dispatch systems, but we also show them to riders so they can make decisions based on wait times.

We don’t have estimates at city launch, but Google does. Google has services that predict travel times, and that’s what we used to start in NYC. Unfortunately, we found that Google’s ETA predictions were, on average, off by 3.6x the actual pickup time in NYC during our first week. (Thanks, New York crosstown traffic and congestion!) So we’re working on using our own algorithms instead of Google’s API for ETAs. And as our data shows, we’re better at it:

Uber vs. Google

We measure our predictor’s accuracy using the mean squared error; the lower the error, the better. And as the next graph shows, as we accumulate rides we’re getting better by the day:

Over time, the gap between our predictor’s accuracy and Google’s widens in our favor. Math!
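
For the curious, here is a minimal sketch of how mean squared error compares two predictors. The function name and all the numbers are made up for illustration; they are not our real data or our production code.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one observation")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical pickup times in minutes (illustrative only).
actual_minutes = [4.0, 6.0, 5.0, 8.0]
predictor_a = [5.0, 6.0, 4.0, 7.0]   # close to what actually happened
predictor_b = [2.0, 3.0, 2.0, 4.0]   # consistently too optimistic

mse_a = mean_squared_error(predictor_a, actual_minutes)
mse_b = mean_squared_error(predictor_b, actual_minutes)

print(f"Predictor A MSE: {mse_a}")  # 0.75
print(f"Predictor B MSE: {mse_b}")  # 9.5
```

Because the differences are squared, big misses cost far more than small ones, so a predictor that is occasionally way off scores worse than one that is always a little off.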

To be completely fair, we’re not claiming we’re better than Google. Our domain is more restricted: we have reliable and experienced drivers from whom we can pull real-time data. Besides, Google’s APIs never made claims of accuracy (“it’s for planning purposes, etc.”), and they’re great for almost everything else, such as geocoding street addresses into latitude/longitude coordinates and then back again. In other words, we love them for everything except accurate ETAs.

ETAs are just one, albeit important, part of our pool of #UberData projects. As we improve in other areas, such as demand prediction or supply positioning, ETA accuracy can lag behind. In fact, some methods that improve actual wait times may negatively impact our predictor. So there’s some give and take here, and we iterate on each of the core projects to make the system truly Uber in the long run.

Regardless, expect all things Uber to become better, if not more accurate, as our ridership rises.

#UberData is a series of posts by the Uber Engineering Team. We’re highlighting cool and amazing things that data and math can show us to make a better Uber product. (And occasionally make us chuckle.)