Data / ML

Engineering Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber

September 6, 2017 / Global
Algorithm 1: Our MC dropout algorithm is used to approximate both model uncertainty and model misspecification.
Algorithm 2: Our inference algorithm combines our inherent noise estimation and MC dropout algorithms.
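The two algorithm captions above can be illustrated with a minimal sketch. It is not the authors' implementation: a toy linear model with input dropout stands in for the neural network, and all names (`dropout_forward`, `mc_dropout`, `inherent_noise`, `predictive_interval`) are hypothetical. It assumes that the inherent noise is estimated as the mean squared residual on a held-out set, that the two variance terms are added, and that a 95 percent interval uses the normal quantile 1.96.

```python
import math
import random
import statistics


def dropout_forward(x, weights, p, rng):
    """One stochastic forward pass of a toy linear model (stand-in for the network)."""
    keep = 1.0 - p
    total = 0.0
    for xi, wi in zip(x, weights):
        # Keep each input with probability `keep`; rescale so the expectation is unchanged.
        if rng.random() < keep:
            total += wi * xi / keep
    return total


def mc_dropout(x, weights, p=0.1, passes=200, seed=0):
    """Repeat stochastic passes; return the sample mean and sample variance."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, p, rng) for _ in range(passes)]
    return statistics.fmean(samples), statistics.pvariance(samples)


def inherent_noise(y_true, y_pred):
    """Assumed noise estimate: mean squared residual on held-out data."""
    return statistics.fmean((t - q) ** 2 for t, q in zip(y_true, y_pred))


def predictive_interval(mean, model_var, noise_var, z=1.96):
    """Combine both variance terms (assumed additive) into an approximate interval."""
    std = math.sqrt(model_var + noise_var)
    return mean - z * std, mean + z * std


# Usage with made-up numbers.
mean, model_var = mc_dropout([1.0, 2.0, 3.0], [0.4, 0.3, 0.2], p=0.2)
noise_var = inherent_noise([1.0, 2.0, 3.0], [1.1, 1.8, 3.2])
low, high = predictive_interval(mean, model_var, noise_var)
```

Keeping the model-variance and noise-variance terms separate makes it easy to see which source dominates the width of the interval for a given input.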
Figure 1: Our neural network architecture incorporates a pre-training phase using an LSTM encoder-decoder, followed by a prediction network. Input to this neural network includes the learned embedding concatenated with external features.
Table 1: The SMAPE is compared across four different prediction models and evaluated against the test data.
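For readers unfamiliar with the metric in Table 1, here is a small sketch of SMAPE (symmetric mean absolute percentage error). Several variants exist and the caption does not say which one is used, so the formula below (absolute error divided by the mean of the absolute actual and forecast values, averaged and expressed in percent) is an assumption, as is the function name `smape`.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (one common variant)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # When both values are zero the term is taken as 0 (an assumed convention).
        if denom > 0:
            total += abs(f - a) / denom
    return 100.0 * total / len(actual)


# Usage with made-up numbers.
score = smape([100.0, 200.0], [110.0, 180.0])
```

Lower values indicate better forecasts; a perfect forecast yields 0.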
Figure 2: Daily completed trips in San Francisco during eight months of the testing set. True values are represented by the orange solid line, and predictions by the blue dashed line, where the 95 percent prediction band is shown as the grey area. (Note: exact values are anonymized.)
Table 2: Coverage of 95 percent predictive intervals in the Encoder + Prediction Network + Inherent Noise Level (Enc+Pred+Noise) scenario is determined using the test data.
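The coverage reported in Table 2 is, in general terms, the share of test observations that fall inside their predictive intervals. A minimal sketch, with the hypothetical helper name `interval_coverage` and made-up numbers:

```python
def interval_coverage(actual, lower, upper):
    """Fraction of observations lying within [lower, upper], bounds inclusive."""
    if not (len(actual) == len(lower) == len(upper)) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return hits / len(actual)


# Usage with made-up numbers: two of the three observations are covered.
cov = interval_coverage([1.0, 5.0, 10.0], [0.0, 6.0, 9.0], [2.0, 7.0, 11.0])
```

For well-calibrated 95 percent intervals, this fraction should be close to 0.95 on held-out data.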
Figure 3: The estimated prediction standard deviations on six U.S. holidays during our testing period suggest that New Year’s Eve results in the largest standard deviations across all eight cities. (Note: exact values are anonymized.)
Figure 4: The time series training set, visualized in the embedding space, is composed of points that represent a 28-day segment, colored according to the day of week. In this PCA visualization, we evaluate the cell states of the two LSTM layers, where the first layer with dimension 128 is plotted on the left and the second layer with dimension 32 is plotted on the right.
Figure 5: Four sample metrics (measured in minutes) track rider behavior during a 12-hour span, and anomaly detection is performed on the 30 minutes following this interval. The neural network constructs predictive intervals for the following 30 minutes, visualized by the shaded area in each plot.

Posted by Lingxue Zhu, Nikolay Laptev