Imagine a warehouse with thousands of trays arriving daily. Your task is to arrange them into neat lanes, but there's a catch: you need to use as few lanes as possible to save floor space, while also keeping each lane short enough for your forklifts to operate efficiently. This is a classic large-scale optimization puzzle where off-the-shelf solvers can grind to a halt. In this post, we'll explore how to model this challenge, starting with a standard mathematical formulation and escalating to a powerful technique called Column Generation, designed specifically for problems of this scale.
we can view this as a sophisticated variant of the classic Bin Packing Problem. In the standard problem, you minimize the number of bins. Here, we have a trickier, bi-objective goal: we want to minimize the number of 'bins' (lanes) and minimize the length of the longest 'bin'.
What is a bin packing problem?
Bin packing problem is a classic optimization problem where you have a set of items and a set of bins, and you want to pack the items into the bins such that the total volume of the items is less than or equal to the volume of the bins.

According to Wikipedia, the bin packing problem is a combinatorial optimization problem where the goal is to pack a set of items into a finite number of bins or containers, minimizing the number of bins used while satisfying certain constraints. The items are typically of different sizes, and the bins have a fixed capacity. The problem is NP-hard, meaning that no efficient algorithm is known for solving it optimally in polynomial time.
A practical-alike example
Currently, imagine you store trays in a linear lane system positioned against a wall in one corner of the facility. Each lane extends perpendicular from the wall and accommodates trays in a single-file arrangement. While you initially considered stacking trays to maximize vertical space, safety concerns led us to maintain single-level storage.
The standard lane configuration calls for:
- 1-meter lane width
- 1.9-meter maximum lane length
However, real-world operations often require us to exceed these parameters. When daily tray volumes don't fit within our ideal dimensions, we face two scenarios: lanes that stretch beyond 2 meters, or an increased number of lanes. While both solutions are workable, they create challenges for our forklift operators who need clear, straight pathways perpendicular to the storage lanes.
Additionally, our trays come in three distinct sizes—Small, Medium, and Large—each with fixed dimensions that must be oriented in a specific direction within the lanes due to handling requirements.
Mathematical formulation
First, we define the following sets and parameters, decision variables, objective function, and constraints for the problem.
denotes the length for tray .
For decision variables, we have:
We have the following objective function:
The objective function balances two competing goals:
- Minimize the maximum lane length
- Minimize the number of lanes used
We also have the basic constraints for the problem:
Unique assignment: Each tray is assigned to exactly one lane.
Lane length definition: The length of each lane equals the sum of assigned tray lengths.
Lane usage constraint: Lane usage is activated when trays are assigned.
This is the Big-M constraint and we define in our problem.
Maximum lane length definition: Define the maximum length across all lanes.
Feasibility: At least one lane must be used.
This constraint ensures that the maximum lane length z is at least as large as the longest single tray, which is a logical lower bound.
Variable Domains:
Sensitivity analysis
Sensitivity analysis is a technique used to understand how changes in the input parameters of a model affect the output. In our problem, we can perform sensitivity analysis on the weight parameter .

The figure illustrates that the trade-off is unavoidable: as we reduce the number of lanes (red line going down), the maximum lane length must increase (blue line going up). The practical decision comes down to what matters more operationally: minimizing the number of lanes (to save floor space) or keeping lanes shorter (for forklift efficiency).
It's easy to present this chart to stakeholders and ask the following business question: ``Given your forklift operations and available floor space, which part of the figure represents the best compromise for your daily operations?" If the idea length for the lane is equal or lower than 2 meters, the number of lanes will be at least 7 (according to the 16 tray testing). This gives them a visual tool to make informed decisions based on their specific operational priorities rather than leaving it as an abstract optimization problem.
Scaling Up: When MILP Hits a Wall, Enter Column Generation
The MILP formulation is clear and exact. For a few dozen trays, it works beautifully. But what happens when you have 700, 1,000, or 10,000 trays? The number of possible assignments () is for each tray , which is prohibitively large. Using this problem as an example, even with advanced MILP tricks like symmetry breaking constraints, and variable fixing strategies, the problem becomes intractable.
This is a classic symptom of a problem perfectly suited for Column generation. The core idea of CG is to change our perspective. Instead of thinking about assigning individual trays to lanes (a massive number of decisions), we think about building a library of "good" lane configurations, called patterns. Then, our only decision is how many times to use each pattern from our library. We don't need to know every possible pattern in the universe to start. We can start with a few obvious ones and then intelligently generate new, better patterns only when we need them. This is the magic of column generation: it avoids getting lost in a sea of possibilities by only considering the most promising ones.
The New Structure: A Master Problem and a Pricing Subproblem
The CG approach splits the problem into two coordinated parts:
- The Master Problem: This is the manager. It looks at the current library of known patterns and tries to find the best way to combine them to satisfy all the tray requirements at the minimum cost. It's a relatively small and easy-to-solve problem.
- The Pricing Subproblem (or "Pattern Generator"): This is the creative expert. The Master Problem asks the Subproblem a crucial question: "Given my current costs, can you find a new, unlisted pattern that would improve my overall solution?" These two problems "talk" to each other in a loop until the Subproblem can no longer find any better patterns.
The Master Problem: Choosing the Best Patterns
Let's formalize the manager's task. We define a "pattern" as a valid combination of trays that can fit into a single lane. The Sets and Parameters are as follows:
For each pattern :
- : set of trays included in pattern
- : total length of pattern
Decision Variables:
Objective Function:
Constraints:
Tray coverage: Each tray must be covered exactly once
Maximum length definition:
Variable domains:
This is a Linear Program (LP), which is very fast to solve. When we solve it, we get two crucial pieces of information: the current best solution and the dual values. The dual values are the shadow prices of the constraints. A higher dual value means the Master Problem is struggling to place tray and would pay a high price for a pattern that includes it.
The Pricing Subproblem: Finding a Better Pattern
Let be the dual variable for Tray coverage constraint and be the dual variable for Maximum length definition constraint. The pricing subproblem finds the pattern with the most negative reduced cost. The subproblem's goal is to use the prices ( and ) from the Master Problem to find a new pattern that is a "good deal."
Decision Variables:
Objective Function:
Maximize the reduced cost:
The term represents the coefficient of in the master problem objective.
Constraints:
Length feasibility: Pattern must fit in a lane
Compatibility constraints: Large and medium trays cannot coexist
where and are the sets of large and medium tray indices, respectively.
Variable domains:
This subproblem is a variation of the Knapsack Problem, which is NP-hard but can be solved very quickly for the sizes we care about.
The Column Generation Procedure: A Dialogue
The overall algorithm is an iterative conversation:
- Initialize: Start with a simple set of patterns (e.g., one pattern for each tray by itself). This guarantees a feasible, if terrible, starting solution.
- Master Solves: Solve the Master Problem LP using the current set of patterns . Get the dual prices .
- Master Asks Subproblem: The Master sends the prices to the Subproblem and asks, "Can you find a new pattern with a negative reduced cost based on these prices?"
- Subproblem Answers:
- If YES: The Subproblem solves its knapsack problem. If the optimal objective value is greater than 0 (meaning the reduced cost is negative), it has found a profitable new pattern. This pattern (defined by the
x_i
variables) is sent back to the Master. The Master adds this new "column" to its set and goes back to Step 2. - If NO: If the Subproblem's optimal objective value is 0 or less, it means no profitable pattern exists anywhere in the universe. The conversation is over. We have found the optimal solution to the LP relaxation.
- If YES: The Subproblem solves its knapsack problem. If the optimal objective value is greater than 0 (meaning the reduced cost is negative), it has found a profitable new pattern. This pattern (defined by the
- Get the Final Integer Solution: Once the loop terminates, we have a great collection of highly effective patterns in . We solve the Master Problem one last time, but now with the additional constraint that the variables must be integers. This gives us our final, practical warehouse plan.
The Payoff: Solving the Unsolvable
After implementing this advanced approach, the computational improvements were dramatic. While the original MILP struggled with trays, the column generation algorithm delivered outstanding performance:
- For an instance with trays, we found a solution with an optimality gap of just 0.38%, which is excellent for practical purposes.
- A large instance with trays was solved in 60.46 seconds.
- A massive instance with trays was solved in just 131.58 seconds.
This demonstrates the power of column generation: by reframing the problem and intelligently exploring the solution space, we can solve enormous, real-world instances that are completely out of reach for standard methods.
Key takeaways
- Model the Real Problem: Many real-world optimization problems are variants of textbook cases. Don't be afraid to add custom constraints (like tray compatibility) to a standard model like Bin Packing to reflect operational reality.
- From Math to Decisions: An optimization model's true value is unlocked when it's used as a decision-making tool. A sensitivity analysis chart is often more powerful for a stakeholder than a single "optimal" number.
- Scale Kills, Advanced Methods Save: Standard MILP models are excellent, but their performance can degrade rapidly with size. For large-scale problems, advanced techniques like Column Generation are not just academic exercises—they are practical necessities for finding high-quality solutions in a reasonable timeframe.
- The CG Mindset: Column Generation fundamentally reframes the problem from "assign items to bins" to "find the best combinations of items (patterns) and then choose from those." This "pattern-based" approach is a powerful paradigm for solving many large-scale combinatorial problems.