Push notifications are an integral channel for Uber Eats customers to discover new restaurants, valuable promotions, new offerings such as grocery and alcohol, and the perks of becoming a member, among other things. Push notifications are sent by various internal teams, such as Marketing, City Operations, and Product. Since marketing push notifications were introduced in March 2020, not only did the list of sending teams grow quickly, but volume also grew rapidly, reaching billions of notifications per month by the end of 2020.
We quickly noticed a variety of issues:
- There were core quality issues (e.g., notifications sent after hours, invalid deeplinks, duplicates, invalid promo codes, landing users on closed stores)
- Notifications were being sent within minutes or hours of each other, many with conflicting messaging
- Pushes were being sent to users with little to no personalization in terms of what push the user would prefer to receive, at what time, or at what frequency
- Our marketing teams had introduced a variety of manual methods to control conflicting messaging, adding upwards of 15 hours per week per team member and shifting valuable strategy time to menial orchestration tasks
At Uber, we strive to provide a best-in-class user experience, and we quickly realized that a comprehensive approach was going to be required to do this for push. We introduced a system we call the Consumer Communication Gateway (CCG): a centralized intelligence layer to manage the quality, ranking, timing, and frequency of push notifications on a user level.
The system sits between these incoming pushes and the user’s device. Incoming pushes are persisted and buffered in a per-user “inbox.” The problem statement can then be formulated as follows:
What is the best “schedule” with which to send the pushes currently buffered for a user?
Planning Over a Time Horizon
The system is designed to consider potential combinations of pushes and delivery times over a fixed time horizon in the future, and pick the schedule that maximizes some objective. To keep things simple, let’s assume that we can only send 2 pushes per day, our time horizon is 7 days, and we have 3 pushes in a user’s inbox. With this setup, we can come up with schedules that send zero or more of these pushes over the next week.
Picking the Best Schedule
With N pushes and S slots under consideration, the number of possible schedules has factorial growth. This makes it infeasible to consider each schedule individually.
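To make the growth concrete, here is a quick count of the candidate schedules for the toy setup above (3 pushes, 7 days × 2 slots/day = 14 slots), under the assumption that each push is sent at most once and no two pushes share a slot:

```python
from math import comb, perm

def schedule_count(pushes: int, slots: int) -> int:
    # Choose which k pushes to send, then assign them to k distinct
    # slots (order matters, since slots are distinguishable).
    return sum(comb(pushes, k) * perm(slots, k) for k in range(pushes + 1))

print(schedule_count(3, 14))  # 2773 candidate schedules even for this tiny example
```

Even three pushes over a week produce thousands of candidates, and the count explodes as inbox size and horizon grow.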
Instead, we formulate the problem as an Assignment Problem: if each assignment of a push to a time has some value (score), what is the schedule that maximizes the sum of scores from its assignments?
This problem can be solved efficiently with an integer linear program solver. In the linear program formulation, we can also encode business logic for pushes with linear constraints:
- Push expiration time (e.g., must send the push before a promotion expires)
- Push send window (e.g., only morning hours)
- Daily frequency cap (e.g., at most 2 pushes per day)
- Minimum time difference between push notifications (e.g., 8 hours between pushes)
- Restaurant open hours
Given a set of candidate push notifications for a user and a set of possible delivery times, the optimization framework identifies the optimal (push, time) pairs:

maximize    Σ_i Σ_t s_{i,t} · x_{i,t}
subject to  Σ_t x_{i,t} ≤ 1 for each push i    (send each push at most once)
            Σ_i x_{i,t} ≤ 1 for each time t    (one push per delivery slot)
            x_{i,t} ∈ {0, 1}

Where x_{i,t} is a binary indicator of whether to send push i at time t, and s_{i,t} is the score (value) of sending push i at time t.
Besides those shown above, many other constraints can be encoded with a linear inequality.
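As a sketch of the formulation (not Uber’s implementation), the toy example below finds the optimal assignment by brute force; a production system would hand the same objective and constraints to an ILP solver. The scores, slot layout, and expiry encoding are invented for illustration.

```python
from itertools import product

def best_schedule(scores, slot_day, daily_cap, expiry):
    """Maximize total score over assignments of pushes to time slots.

    scores[i][t] : value of sending push i at slot t
    slot_day[t]  : which day slot t falls on (for the daily cap)
    expiry[i]    : last slot at which push i may be sent (None = no expiry)
    """
    n, s = len(scores), len(slot_day)
    best_score, best_assign = 0.0, {}
    # Each push is either dropped (None) or assigned exactly one slot.
    for assign in product([None] + list(range(s)), repeat=n):
        used = [t for t in assign if t is not None]
        if len(set(used)) != len(used):  # at most one push per slot
            continue
        if any(t is not None and expiry[i] is not None and t > expiry[i]
               for i, t in enumerate(assign)):
            continue
        sends_per_day = {}
        for t in used:
            sends_per_day[slot_day[t]] = sends_per_day.get(slot_day[t], 0) + 1
        if any(c > daily_cap for c in sends_per_day.values()):
            continue
        total = sum(scores[i][t] for i, t in enumerate(assign) if t is not None)
        if total > best_score:
            best_score = total
            best_assign = {i: t for i, t in enumerate(assign) if t is not None}
    return best_score, best_assign

# 3 pushes, 4 slots over 2 days, at most 1 push/day; push 0 expires after slot 1.
scores = [[0.9, 0.1, 0.0, 0.0],
          [0.5, 0.6, 0.7, 0.8],
          [0.4, 0.4, 0.4, 0.4]]
print(best_schedule(scores, slot_day=[0, 0, 1, 1], daily_cap=1,
                    expiry=[1, None, None]))
# best ≈ 1.7: push 0 at slot 0 (before it expires), push 1 at slot 3
```

Note that a greedy scheduler scanning slot by slot could easily burn the day-1 slot on push 1 (score 0.5 or 0.6) and lose push 0 to expiry; the joint optimization avoids this.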
The use of a linear program solver has some advantages over greedier approaches. For example, it notices when a push notification is expiring soon and prioritizes getting it sent out quickly, even if other pushes in the inbox appear to be more valuable in the near-term.
Additionally, it can exploit the different performance that pushes are predicted to have at different times. We might predict push A to perform well at both lunchtime and dinnertime, but push B only at lunchtime. We can then send B at lunch and A at dinner to maximize the value extracted from both.
When the size of the inbox exceeds the frequency cap, the most valuable pushes will be assigned to a time for delivery and the remaining ones will be dropped.
The value of a (push, time) pair is determined with a machine learning model that predicts the probability of a user u making an order within 24 hours of receiving push i at time t:

s_{i,t} = P(u orders within 24 hours | push i is sent at time t)
Specifically, we have trained an XGBoost model on historical data to predict the conversion probability given the following features:
- Time features: hour of day, and day of week
- Push features: category, content identifier, and deeplink
- User features: order history by day of week and mealtime, and engagement history with push notifications by day of week and mealtime
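A minimal sketch of how such a feature vector might be assembled per (user, push, time) candidate; all field and feature names here are hypothetical, not Uber’s actual schema:

```python
from datetime import datetime

def build_features(push: dict, user: dict, send_time: datetime) -> dict:
    """Flatten one (user, push, time) candidate into model features."""
    dow = send_time.weekday()  # 0 = Monday
    return {
        "hour_of_day": send_time.hour,
        "day_of_week": dow,
        "push_category": push["category"],
        "push_content_id": push["content_id"],
        "push_deeplink": push["deeplink"],
        "user_orders_this_dow": user["orders_by_dow"][dow],
        "user_push_opens_this_dow": user["push_opens_by_dow"][dow],
    }

push = {"category": "promo", "content_id": "bogo_q3",
        "deeplink": "ubereats://store/123"}
user = {"orders_by_dow": [2, 0, 1, 0, 3, 5, 4],
        "push_opens_by_dow": [1, 0, 0, 2, 1, 3, 2]}
print(build_features(push, user, datetime(2021, 5, 7, 18, 30)))
```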
Given the strong class imbalance in the dataset (a low percentage of pushes are associated with an order), we downsample the negative class for model training. In addition, we prune the least important features when building the final version of the model.
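The negative downsampling step can be sketched as follows (the keep rate is illustrative):

```python
import random

def downsample_negatives(rows, keep_rate, seed=42):
    """Keep every positive example; keep each negative with
    probability keep_rate to rebalance the training set."""
    rng = random.Random(seed)
    return [r for r in rows
            if r["label"] == 1 or rng.random() < keep_rate]

rows = [{"label": 1}] * 5 + [{"label": 0}] * 1000
sample = downsample_negatives(rows, keep_rate=0.1)
print(len(sample))  # all 5 positives plus roughly 100 of the 1000 negatives
```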
The model shows good predictive power during online evaluation, with a higher predicted conversion rate corresponding to a higher true conversion rate, as shown in the figure below:
Looking at the plot above, the behavior of the model is non-monotonic for scores less than 0.2; however, less than 0.4% of the model’s predictions have such scores.
The system was implemented with four components, each with a distinct responsibility. At a high level:
- The Persistor stores pushes onto non-volatile storage (inbox) for access by the other components
- The Schedule Generator fetches all buffered pushes for a user from the inbox, and determines the optimal schedule for them
- The Scheduler receives as input a chosen time for a chosen push, and makes a best attempt to trigger the delivery of that push at that time
- The Push Delivery component sends the push along to downstreams responsible for device delivery
The Persistor is the entrypoint to the Push Intelligence system, receiving pushes intended for delivery to the user via gRPC.
Then, it stores the push content along with its metadata (promotion, restaurant hours, etc.) into the inbox, which is a sharded array of MySQL datastores. Uber infrastructure abstracts away the sharding and provides an API to interact with a single document store (docstore). The inbox table is partitioned by the recipient’s user-UUID, allowing for horizontal scaling with minimal hotspots, and co-location of multiple pushes intended for the same user.
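As an illustration, an inbox row might carry fields along these lines; the names and types are hypothetical, since the post only specifies user-UUID partitioning and push content plus metadata:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class InboxPush:
    user_uuid: str     # partition key: shards the inbox and co-locates
                       # all pushes intended for the same user
    push_id: str
    category: str
    payload: dict      # rendered content and deeplink
    metadata: dict     # promotion, restaurant hours, etc.
    expires_at: Optional[datetime] = None
    scheduled_for: Optional[datetime] = None  # written by the Schedule Generator

row = InboxPush("u-1", "p-42", "promo",
                {"title": "Lunch deal"}, {"promo_id": "bogo"})
```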
The schedule generator is triggered each time a push is persisted into the inbox for a user. It fetches all pushes buffered for that user, even if they were already scheduled. This allows it to reschedule previous pushes, with the existence of the newest push taken into account.
The schedule generator uses Uber’s ML platform (Michelangelo) in conjunction with a linear program solver, as described above, to determine the optimal schedule for the push notifications that the user has in their inbox.
When the Schedule Generator has found a schedule, it calls the Scheduler for each push-time assignment. The Scheduler has to provide a distributed, cron-like system with a throughput upwards of tens of thousands of triggers per second.
We built this on Cadence, which provides fault-tolerant processing of cron-like tasks, but at a lower throughput than we need. To bridge the gap, push-time assignments are buffered into one of many Kafka topics, one per hour in the time horizon. Cadence then turns consumer-group consumption on and off for one topic at a time.
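The hour-bucketed buffering can be sketched as routing each assignment to the Kafka topic for its delivery hour; the topic naming scheme here is invented:

```python
from datetime import datetime, timezone

def hourly_topic(scheduled_for: datetime, prefix: str = "push-schedule") -> str:
    """Route a (push, time) assignment to the buffer topic for its hour.
    Cadence later enables consumption of each topic when its hour arrives."""
    utc = scheduled_for.astimezone(timezone.utc)
    return f"{prefix}.{utc:%Y-%m-%dT%H}"

t = datetime(2021, 5, 3, 14, 30, tzinfo=timezone.utc)
print(hourly_topic(t))  # push-schedule.2021-05-03T14
```

Buffering by hour means only one topic needs an active consumer group at any time, which keeps the trigger rate within what Cadence can drive while Kafka absorbs the volume.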
The scheduler is idempotent, and allows a scheduled push to be rescheduled to another time.
When the scheduler determines that a scheduled push is ready for sending, it triggers the push delivery component. This component is responsible for some last-mile checks, such as whether Uber has enough delivery drivers at the current time.
In addition to delivering pushes to users, it can instead mark them as suppressed (if they failed some last-mile check) or simply expired (if it expired before getting a chance to be delivered). This component also works asynchronously behind a Kafka buffer, providing smoother load and retries.
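The delivery component’s terminal states can be sketched as follows; the specific check functions (e.g., courier availability) are placeholders:

```python
from datetime import datetime, timezone
from enum import Enum

class Outcome(Enum):
    SENT = "sent"
    SUPPRESSED = "suppressed"   # failed a last-mile check
    EXPIRED = "expired"         # expired while buffered

def deliver(push: dict, now: datetime, last_mile_checks) -> Outcome:
    expires_at = push.get("expires_at")
    if expires_at is not None and now >= expires_at:
        return Outcome.EXPIRED
    if not all(check(push, now) for check in last_mile_checks):
        return Outcome.SUPPRESSED
    return Outcome.SENT  # hand off to downstream device delivery

now = datetime(2021, 5, 3, 12, 0, tzinfo=timezone.utc)
enough_couriers = lambda push, t: True  # placeholder last-mile check
print(deliver({"expires_at": None}, now, [enough_couriers]))  # Outcome.SENT
```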
The initial results from early experiments have been very positive. We’ve seen not only a reduction in opt-outs, but also a strong increase in the relevance of notifications. In the future, we will look to further boost the impact of messaging intelligence in a few different ways:
- Improving the core models. While our initial models focused on ranking and timing have been successful, we are still in a foundational state. First and foremost, we would like to expand the model to take into account multiple objectives (conversion; showcasing Uber One membership, grocery, and retail stores; opt-outs; etc.). We would also like to move to near-real-time (NRT) features to get quicker signal, and to a deep learning algorithm.
- Expanding across channels. On Uber Eats, more volume comes from email than push. Email is also a channel with a wider range of authors and content. We are able to leverage the architecture we have built, only requiring new models to be built and integrated for email. Our first focus will be on basic guardrails for frequency, but we will shift to timing and ranking as we continue to optimize, similar to push. We’ll also start feeding the signal from out-of-app intelligence models to our in-app intelligence models to better optimize the end-to-end experience for users.
- Expanding across the platform. While Uber Eats is the primary focus given the volume of communications, there has been an increasing amount of volume on the core Uber (Rides) app and Postmates. Given that a significant amount of users are dual-platform users, we plan to expand intelligence to Rides communications as well to optimize communications on a user level.
While we have come a long way, we still have a ways to go, but we have strong conviction we can get there. Stay tuned for more updates as we push forward (pun intended)! 🙂
This work would not have been possible without the countless hours of support and work done by multiple teams, including Consumer Communications Gateway, Operations, Marketing/CRM, and Product.
Main Image by Mike MacKenzie www.vpnsrus.com