Beyond Prediction: Solving the Multiple Knapsack Problem at Scale: How Uber Optimizes Incentives
Introduction
What do scheduling television advertisements, managing a financial portfolio, and distributing customer incentives have in common?
At first glance, they seem like vastly different problems. However, they share the same mathematical DNA: they're negotiations with constraints. They're instances where we seek to maximize value—revenue, ROI, or space—while fitting strictly within limited bins, whether those bins are time slots, capital risk limits, or shipping crates. In computer science, this is the MKP (Multiple Knapsack Problem).
In the modern data stack, this problem has evolved. Companies rely on uplift models to identify the most effective treatment for a user. But when you have millions of users, hundreds of concurrent treatments, and strict quarterly budgets, the question shifts from “Which treatment is most effective?” to “Which allocation maximizes the overall marketplace efficiency without breaking the bank?”
Enter Tarot (Targeting Orchestrator), Uber's internal targeting platform. By combining budget pacing with a multi-lever optimizer, Tarot solves this by treating incentive allocation as a massive-scale MKP. Tarot ensures that we are driving toward an optimal trade-off between user experience and quarterly budget constraints.
For the engineering community, this presents a classic distributed systems challenge: How do you solve an NP-hard optimization problem without buckling under the weight of millions of requests? This blog outlines how we moved beyond heuristics to architect a production-grade constraint solver capable of high-stakes business decisions.
The Incentive Challenge: A Textbook Multi-Knapsack Problem
While the MKP appears in logistics and finance, Tarot's first use case at Uber is Incentives. To understand the complexity, we can map the reality of growth marketing to the five components of a classic MKP.
The Knapsacks (Team Budgets)
Each organizational unit at Uber operates with a finite capacity of money to spend per quarter to achieve their specific objectives.
For example, the Mobility Growth team has a dedicated budget to boost driver supply on the platform, while the Delivery Growth team has a completely separate bucket to increase courier availability. The optimization must respect the hard limits of each team's quarterly allocation, so that we never spend Delivery budget to improve driver (rather than courier) availability on the platform.
The Items (The Treatments)
The items we need to pack are the specific Incentive Programs. These are the levers we pull to influence behavior.
For example, a sign-up reward for a new driver joining a specific line of business (Rides versus Uber Eats), or a streak bonus to ensure continuous availability during busy time windows.
The Weight (The Cost)
Every item has a weight, which corresponds to the monetary cost of the incentive.
For example, offering a $100 bonus weighs more against the budget than a $20 quest. In a massive marketplace, even small differences in weight multiply rapidly when targeted across millions of people.
The Value (The Objective)
The value is the predicted incremental impact. This is where our uplift models come into play: we estimate the incremental value generated by a specific incentive, such as improvements to someone's earnings and to overall marketplace health.
For example, the expected increase in the number of trips taken, or the additional earnings per hour a driver earns because they received that incentive.
The Goal (The Optimization)
Finally, the goal is to solve the puzzle: maximize total ROI across all of Uber's teams. We must allocate the right incentives to the right people to satisfy competing priorities across different lines of business and geographies.
The Pre-Tarot Landscape: First-In Wins
To prevent interface clutter, we strictly limit concurrent offers. Historically, we resolved contention using a naive FIFO (First-In-First-Out) strategy, where early incentives often blocked more relevant ones. Targeting relied on manual heuristics rather than a consistent, data-science-driven approach, while budget management was entirely reactive. Operations teams manually adjusted rewards to meet quarterly limits, leading to inefficient capital allocation and inconsistent user experiences.
Architecture
Tarot relies on a modular architecture where a central orchestrator coordinates between data platforms, budget services, and optimization solvers.

Figure 1: Architecture diagram with all major components of Tarot and their interactions.
Key Components
- Orchestrator: The heart of the system. It manages the targeting life cycle, communicating with components to fetch data, request optimizations, and trigger assignments. It executes a workflow based on different configurations and interacts with other components in either event-driven or synchronous mode, depending on QPS and consistency needs.
- Segmentation Platform: Defines user cohorts, providing a superset of eligible users.
- Budget Pacer: A control-loop component that manages spend velocity by analyzing the configured total, realized spend, and projected future spend.
- Multi-Lever Optimizer: The decision engine. Designed with a plug-and-play interface, it currently utilizes the Google OR-Tools CP-SAT solver to handle optimization constraints.
- ML Platform: Provides prediction data (uplift, conversion probability).
- Domain Services: Downstream services responsible for execution (like applying the promotion to the person’s account).
Execution Flow
The Tarot workflow transforms a broad list of eligible users into a precise set of optimized assignments.
Figure 2: Sequence diagram explaining all the steps in the Tarot workflow.
Here’s how it works:
- Initialization. The workflow is triggered via configurable Cron schedules. This stage handles the complexity of varying incentive cadences, defining which incentive groups are optimized jointly and which are processed independently.
- Opportunity generation. The Orchestrator ingests target user pools from the Segmentation Platform. It creates a Cartesian product of users against all valid levers, generating a massive candidate list of potential opportunities.
- Prediction. We query the ML Platform to score each opportunity. Using models trained on historical experiment data, we predict the expected ROI and cost for every individual user-lever pair.
- Budget pacing. We calculate the available cap for the current run, reconciling total budget against realized and estimated spend.
- Optimization. Formulated as an MKP, the system selects the subset of assignments that maximizes total value without violating constraints.
- Final assignment. The resolved list of optimal assignments is published to Domain Services for immediate execution.
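As a rough illustration, the opportunity-generation step above is a Cartesian product of users and levers. The names below are hypothetical, not Tarot's actual API:

```python
from itertools import product

users = ["u1", "u2", "u3"]                 # from the Segmentation Platform
levers = ["signup_bonus", "streak_bonus"]  # active incentive programs

# Each (user, lever) pair is one candidate opportunity to be scored
# by the ML Platform and then filtered by the optimizer.
opportunities = [{"user": u, "lever": l} for u, l in product(users, levers)]
print(len(opportunities))  # 3 users x 2 levers = 6 candidates
```

This product grows multiplicatively, which is why the downstream solver has to scale to millions of user-lever pairs.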
Calculating ROI
At Uber's scale, calculating the ROI (Return on Investment) for any given opportunity is rarely linear. Because our lines of business, like Mobility and Delivery, often intersect, optimizing for one can inadvertently destabilize the other.
One example of this is cross-vertical cannibalization. A campaign designed to boost Delivery trips might succeed in isolation but could simultaneously reduce Mobility bookings if users substitute a ride for a delivery. While the primary metric improves, the net impact on the overall marketplace could be negative.
Another example is strategic weighting versus immediate profit. Emerging bets (like a new product launch or geographic expansion) often show lower immediate engagement compared to established markets. However, these initiatives require sustained investment to reach critical mass.
To resolve the conflict between immediate engagement and long-term strategy, we treat ROI as a multi-objective optimization problem rather than a single metric prediction. We engineered a two-layer architecture:
- The prediction layer (objective). Instead of predicting one particular metric directly, our models forecast the raw uplift across a state vector of metrics [x], such as Rides trips, Uber Eats orders, driver utilization, and earnings per hour.
- The valuation layer (subjective). We assign a dynamic weight vector (w) to these metrics based on the specific market context and current strategic bets (like weighing growth higher than efficiency in new markets).
The final ROI is the dot product of the strategic weights and the predicted metric uplifts, as shown in Figure 3.
Figure 3: Final ROI equation.
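In code, the two-layer valuation reduces to a dot product. The metric names and weights below are illustrative, not Uber's actual vectors:

```python
# Prediction layer (objective): forecasted uplift per metric
# for one user-lever pair.
uplift = {"rides_trips": 0.8, "eats_orders": -0.2, "earnings_per_hour": 1.5}

# Valuation layer (subjective): strategic weights for the current
# market context and bets.
weights = {"rides_trips": 1.0, "eats_orders": 1.2, "earnings_per_hour": 0.5}

# Final ROI = w . x; note how cross-vertical cannibalization
# (the negative uplift) is netted against the other metrics' gains.
roi = sum(weights[m] * uplift[m] for m in uplift)
print(round(roi, 2))  # 0.8*1.0 - 0.2*1.2 + 1.5*0.5 = 1.31
```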
Budget Pacer
ML models are subject to drift, but financial budgets are deterministic. To resolve the tension between probabilistic predictions and hard financial limits, we introduced the Budget Pacer. While budgets can be defined at a weekly granularity, the agreements are at a quarterly level. This means the pacing system can shift spend across weekly periods, as long as it maximizes utilization (aiming for around 100%) without overspending the quarterly total.
The Pacer acts as a feedback control loop. It continuously recalculates the remaining budget by reconciling three inputs:
Figure 4: Time series chart explaining how pacer works.
- Configured budget: The total limit defined by finance.
- Observed spend: The actual cost incurred from previous assignments.
- Predicted liability: The forecasted cost of active incentives that haven't yet been paid out.
By continuously recalculating the remaining budget based on these factors, the system self-corrects. If actual costs come in higher than predicted, the Pacer detects the variance and constrains the budget for future periods, smoothing out fluctuations over time.
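A minimal sketch of the pacing arithmetic (the function and variable names are ours, not Tarot's):

```python
def remaining_budget(configured: float, observed: float,
                     predicted_liability: float) -> float:
    """Cap available for the current optimization run; never negative."""
    return max(0.0, configured - observed - predicted_liability)

# If realized spend overshoots predictions, the cap for future
# periods shrinks automatically on the next run.
print(remaining_budget(1_000_000, 400_000, 250_000))  # 350000.0
print(remaining_budget(1_000_000, 900_000, 250_000))  # 0.0 -- variance absorbed
```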
Optimizer
If constraints are the knapsacks and incentives the items, the Optimizer is the brain solving the puzzle. Acting as the central command, it synthesizes paced budgets, predicted ROI, and costs into a single mathematical mandate. Its directive is computationally heavy: exhaust the available budget to determine the mathematically optimal investment for every user.
Figure 5: Mathematical equation for the optimization problem.
Figure 6: Optimizer inputs and outputs.
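Figure 5 encodes a standard MKP formulation. As a reconstruction in our own notation (not necessarily Uber's exact internal model), let x_ij ∈ {0, 1} indicate whether user i receives incentive j, v_ij its predicted ROI, c_ij its cost, and B_k the paced budget of team k:

```latex
\max_{x} \sum_{i}\sum_{j} v_{ij}\, x_{ij}
\quad \text{s.t.} \quad
\sum_{(i,j)\,\in\,\text{team } k} c_{ij}\, x_{ij} \le B_k \;\; \forall k,
\qquad
\sum_{j} x_{ij} \le 1 \;\; \forall i.
```

The per-user constraint reflects the strict limit on concurrent offers; the per-team constraints are the knapsacks.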
Optimization isn’t static. To handle evolving business needs, we designed a plug-and-play architecture that abstracts the problem definition from the solution method. This allows us to swap solvers based on scale and complexity. Whether optimizing incentives today or courier positioning tomorrow, the system remains goal-agnostic—it cares only about the constraints.

Figure 7: Plug-and-play architecture of the Optimizer
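One way to realize the plug-and-play design is a thin solver interface that decouples the problem definition from the engine. This is our illustration, not Tarot's actual code; the greedy engine is a stand-in for the production CP-SAT implementation:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Problem:
    """Goal-agnostic definition: values, costs, and a budget cap."""
    values: list[float]
    costs: list[float]
    budget: float

class Solver(ABC):
    @abstractmethod
    def solve(self, problem: Problem) -> list[int]:
        """Return the indices of selected items."""

class GreedySolver(Solver):
    """Toy engine; in production this slot is filled by CP-SAT."""
    def solve(self, problem: Problem) -> list[int]:
        # Pick items by value density until the budget is exhausted.
        order = sorted(range(len(problem.values)),
                       key=lambda i: problem.values[i] / problem.costs[i],
                       reverse=True)
        chosen, spent = [], 0.0
        for i in order:
            if spent + problem.costs[i] <= problem.budget:
                chosen.append(i)
                spent += problem.costs[i]
        return chosen

print(GreedySolver().solve(Problem([10, 40, 30], [5, 20, 25], 30)))  # [0, 1]
```

Because every engine consumes the same `Problem`, swapping solvers is a configuration change rather than a rewrite.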
Development Challenges
Budget Pacer: The Challenge of Time-Series Constraints
Initially, we modeled costs as time-series curves over an incentive's duration. While this mirrored the real-world accrual of cost, it introduced significant friction:
- Modeling complexity. Accurately predicting daily user progress required complex models and massive datasets.
- Optimization latency. Dimensionality exploded. The solver had to respect daily constraints (N x T) rather than a single scalar (N), drastically increasing solve time and memory load.
- The choke point phenomenon. A predicted spike on a single day could block the entire assignment despite ample budget on other days, creating artificial bottlenecks.
To address these scaling issues, we shifted our strategy to collapse the total predicted cost of the incentive to the day of assignment.
Figure 8: Cost-based accounting versus accrual accounting.
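The shift can be illustrated with made-up numbers:

```python
# Accrual view: predicted daily cost curve over a 5-day incentive.
# Under per-day constraints (N x T), the day-3 spike could block the
# whole assignment even when other days have ample slack.
daily_costs = [2.0, 5.0, 12.0, 4.0, 2.0]

# Collapsed view: book the entire predicted cost on the assignment
# day, leaving the solver a single scalar per opportunity (N).
assignment_day_cost = sum(daily_costs)
print(assignment_day_cost)  # 25.0
```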
In the first version of the Pacer, we were reaching around 68% utilization. With single-day cost attribution, based on our simulations against a variety of hazard curves, we are looking at 99.99% budget utilization.
Optimizer Scaling Pains: The Pivot from Linear Programming to CP-SAT
Building Tarot wasn’t linear. Our biggest hurdle was selecting a scalable solver.
Initially, we modeled the incentive allocation challenge as a pure LP (Linear Programming) problem. On paper, this made sense; we were maximizing a linear objective function (total ROI) subject to linear constraints (budget). We chose HiGHS, a high-performance open-source linear optimization solver, as our engine.
For small batches, it worked beautifully. But as we ramped up traffic, the complexity curve hit us hard.
The moment we scaled to around 100,000 users with just a single lever (incentive type), the solver hit a wall. The search space for the optimal solution became so vast that HiGHS struggled to converge.
We observed job runtimes spiraling out of control. It began taking more than 24 hours just to find a feasible solution for a relatively small user set. In a production environment where teams need to move fast, a day-long wait was unacceptable.
Recognizing our problem was discrete (combinatorial) rather than continuous, we pivoted to CP-SAT. The solver’s ability to handle logical constraints changed everything:
The same dataset of 100,000 users that choked the LP solver for over 24 hours was solved by CP-SAT in a matter of minutes. Encouraged by this win, we stress-tested the CP-SAT implementation with millions of users across multiple simultaneous levers. Even at this massive scale, the solver consistently returned optimal allocations within our strict SLA timeframes.
This proved that for industrial assignment problems, the ability to rapidly prune the search space outperforms the mathematical purity of standard LP.
Use Cases at Uber
Tarot provides targeted incentives to help drivers and couriers optimize their time on the Uber platform, enabling them to increase earnings from their current work or discover other efficient alternative earning opportunities.
Next Steps
While Incentives represent the initial application, the optimization problem is pervasive across Uber. We’re exploring its use to optimize rider and courier onboarding, communication strategies, and global opportunity recommendations.
Our next major hurdle is automating the tradeoff between exploring for new higher-value opportunities and maximizing immediate returns. Currently, testing new strategies requires manual intervention. We aim to build a closed-loop system where the engine automatically allocates exploration budgets to test new incentive structures, gathers feedback, retrains the uplift models, and injects validated strategies back into the optimization pool.
Conclusion
Tarot demonstrates that effective growth engineering requires more than just predictive ML; it demands rigorous mathematical optimization. By framing incentive targeting as a massive multi-knapsack problem, we transformed a complex web of competing business goals into a solvable equation.
For the engineering community, the key takeaway is the power of hybridizing uplift modeling with constraint programming (CP-SAT) and feedback loops like budget pacing. This architecture moves beyond simple heuristics, driving toward the optimal trade-off between user experience and ROI.
Acknowledgments
Thanks to all the members of the Growth Incentives Product, Engineering, and Science teams for making this a reality.
Cover Photo Attribution: "231025_DSTZ_MOOD-14-BIKE_CAMERA_S_22165_VS_R1" by RJ Shaughnessy | Giant Artists Inc. is licensed to Uber.
Figure 8 Attribution: Image generated using Google Gemini
Google® is a trademark of Google LLC and this blog post is not endorsed by or affiliated with Google in any way.
Stay up to date with the latest from Uber Engineering—follow us on LinkedIn for our newest blog posts and insights.
Rajat Tulasyan
Staff Engineer
Rajat Tulasyan is a Staff Engineer in the Growth Org in Amsterdam. He has worked in several domains over the last 8 years at Uber.
Bhanu Aggarwal
Engineering Manager
Bhanu Aggarwal is an Engineering Manager within the Growth Org in Amsterdam. Over his eight-year tenure at Uber, he has managed and delivered multiple platforms in the incentives domain.