Synthetic Linear Programs

A Guide to SyntheticLPs.jl

Each generator turns three knobs — a target variable count, a feasibility status, and a random seed — into a realistic, reproducible linear program. This page pairs a high-level map of the families with the precise formulation, sizing rules, and feasibility tricks behind every generator.

The shared contract

Call generate_problem(sym, target_variables, feasibility_status, seed). All randomness lives in the constructor; build_model is deterministic, so a seed reproduces an instance exactly.

Sizing by variables

target_variables is interpreted per generator — usually by choosing dimensions whose product or sum approximates the requested count. Ranges and group counts often scale with problem size.

Integer relaxation

Several problems are naturally mixed-integer. generate_problem defaults to relax_integer=true, so binary variables are relaxed unless you opt out. Pages describe the intended MIP and note the difference.

feasible — bounds widened so a solution is constructed or likely infeasible — a binding resource is deliberately over-tightened unknown — a realistic random draw with no guarantee

Pick a generator to see its full write-up, or filter from the sidebar.

← All generators
Network & Routing

Load Balancing

Load balancing generates a continuous network LP that minimizes maximum link utilization while routing fixed traffic demands along preselected paths.

Objective: Min-max Continuous variables

Overview

This generator represents network traffic load balancing. A directed network has link capacities and source-target traffic demands. Each demand is assigned a single generated path. The optimization sets link flows and minimizes the maximum utilization factor needed to carry all path demands.

Unlike a full multicommodity flow model, this implementation does not choose paths or split flow. The paths are generated before the model is built, and each link on a demand path must carry at least that demand amount.

Generator Data and Sizing

The constructor documents target_variables as 1 + number of links, matching the model variables u plus one flow variable per link. It sets:

target_links = max(1, target_variables - 1)
n_links = target_links

Network scale parameters depend on target_variables.

For target_variables <= 250:

  • base_nodes: rounded Uniform(5, 20).
  • density_dist: Uniform(0.4, 0.7).
  • capacity_mean: 500.0.
  • capacity_std: 150.0.
  • demand_mean: 50.0.
  • demand_std: 20.0.
  • max_path_length: DiscreteUniform(2, 4).

For 250 < target_variables <= 1000:

  • base_nodes: rounded Uniform(20, 60).
  • density_dist: Uniform(0.25, 0.5).
  • capacity_mean: 2000.0.
  • capacity_std: 600.0.
  • demand_mean: 150.0.
  • demand_std: 60.0.
  • max_path_length: DiscreteUniform(3, 6).

For target_variables > 1000:

  • base_nodes: rounded Uniform(50, 150).
  • density_dist: Uniform(0.15, 0.35).
  • capacity_mean: 8000.0.
  • capacity_std: 2000.0.
  • demand_mean: 500.0.
  • demand_std: 200.0.
  • max_path_length: DiscreteUniform(4, 8).

link_density = rand(density_dist) is sampled but is not used when building links.

Topology generation:

  • All directed non-self links are candidates.
  • If n_links is at least the number of possible links, all possible directed links are used.
  • Otherwise, the generator first builds a bidirectional spanning structure by adding (from_node, to_node) and (to_node, from_node) while connecting all nodes.
  • It then appends random remaining links up to the target count and applies unique.

Capacity generation:

  • min_capacity: truncated normal around capacity_mean * 0.3, lower-bounded at 10.0.
  • max_capacity: min_capacity plus a truncated normal around capacity_mean * 1.2.
  • Each link capacity is sampled from a truncated normal centered between min_capacity and max_capacity.

Demand generation:

  • Possible demands are all ordered source-target node pairs with i != j.
  • A demand ratio is sampled from Uniform(0.3, 0.7).
  • n_demands is that ratio times n_nodes * (n_nodes - 1), rounded and clamped to available pairs.
  • Demand amounts are sampled from a truncated gamma distribution whose support is based on sampled min_demand and max_demand.

Path generation:

  • For each demand, the generator attempts a random walk from source to target with length up to max_path_length.
  • It prefers unvisited outgoing neighbors, but may revisit if needed.
  • If the random walk reaches the target, the path is stored.
  • Otherwise, if a direct link exists, the direct one-link path is stored.
  • If no path is found, the demand is deleted.

The stored struct fields are:

  • n_nodes::Int
  • links::Vector{Tuple{Int,Int}}
  • capacities::Dict{Tuple{Int,Int},Float64}
  • demands::Dict{Tuple{Int,Int},Float64}
  • paths::Dict{Tuple{Int,Int},Vector{Tuple{Int,Int}}}
  • max_utilization::Union{Float64,Nothing}

The constructor calls Random.seed!(seed), so generation is reproducible for the same inputs and resets Julia's global RNG.

LP Formulation

Sets and indices:

  • Directed links a in L.
  • Demands k = (source, target) in K.
  • Generated path P_k subseteq L for each demand with a stored path.

Decision variables:

u >= 0      maximum link utilization
f_a >= 0    flow on link a

Objective:

\[\min u\]

Link utilization constraints:

\[f_a \le u \cdot capacity_a \quad \forall a \in L\]

Demand/path constraints:

\[f_a \ge demand_k \quad \forall k \in K, a \in P_k\]

Optional maximum utilization constraint, used for forced infeasibility:

\[u \le max\_utilization\]

Bounds:

  • u and all link-flow variables are continuous and nonnegative.

Interpretation: u is the largest capacity multiplier needed for any link flow. Each selected path link must be able to carry the full demand amount for every demand whose generated path uses that link.

Feasibility Controls

The constructor sets actual_status = feasibility_status. If feasibility_status == unknown, it randomly chooses feasible with probability 0.7 and infeasible with probability 0.3.

For feasible, no extra cap on u is added. Because u is unbounded above and all capacities are positive, the model can choose a sufficiently large u to satisfy all path lower bounds.

For infeasible, if there are nonempty demands and paths, the generator:

  1. Initializes a minimum required flow per link to zero.
  2. For each demand path, updates each path link's required flow to the maximum demand using that link.
  3. Computes:
       min_u = maximum(required_flow[link] / capacities[link])
  4. Sets max_utilization = min_u * (0.5 + 0.3 * rand()).

The model then enforces u <= max_utilization, which is below the utilization needed by at least one demanded path link.

If the infeasible branch has empty demands or paths, max_utilization remains nothing, so no infeasibility constraint is added.

For unknown, the chosen actual status follows the same feasible or infeasible path above.

Model Characteristics

Variable count:

1 + length(links)

Constraint count drivers:

  • One utilization constraint per link.
  • One lower-bound flow constraint for each pair (demand, link) where the link lies on that demand's generated path.
  • Optional one upper bound on u for forced infeasibility.

The link-flow model is sparse with respect to demand constraints because each demand only constrains links on its generated path. However, link utilization constraints cover every link.

The model is a continuous LP. There are no integer variables.

Practical Notes

This generator is useful for continuous minimax-style network LPs and for testing models with one global utilization variable coupled to many link variables.

The formulation is not a conservation-based routing model: f[link] is a single aggregate flow variable, and each demand path imposes lower bounds on those shared link variables. Multiple demands on the same link do not add in the model; the binding requirement is the maximum lower bound among those constraints, not the sum of all traffic on the link.

For small target_variables, the connectivity-building step can create more links than target_variables - 1 because it first adds bidirectional tree links for all sampled nodes. In that case the final variable count can exceed the target. The sampled link_density is currently unused.

← All generators
Network & Routing

Multi-Commodity Flow

The multi-commodity flow generator creates continuous minimum-cost routing LPs in which several commodities share capacities on the same directed network.

Objective: Min Continuous variables

Overview

This generator represents shared infrastructure planning: different freight classes, products, or traffic demands each need to move from their own source to their own sink, while all commodities compete for capacity on common arcs. The model minimizes total routing cost subject to arc capacity limits and per-commodity flow conservation.

Generator Data and Sizing

target_variables maps to the product of commodities and arcs:

target_variables ~= n_commodities * n_arcs

The constructor seeds Julia's global RNG with Random.seed!(seed). sample_parameters_mcf also calls Random.seed!(seed), so parameter selection and subsequent data generation are reproducible for a given input.

Parameter selection searches for a realistic combination of commodity count, node count, and arc count. If no combination gets within 10% relative error, it falls back to a heuristic using a random commodity count and target density.

Target variablesCommoditiesNodesDensity bandCapacity rangeDemand rangeCost rangeUtilization
<= 1002:55:150.2 to 0.620.0 to 200.05.0 to 50.01.0 to 15.00.6 to 0.8
<= 5003:1510:300.15 to 0.550.0 to 500.010.0 to 100.01.0 to 25.00.5 to 0.7
> 5008:5015:1000.1 to 0.4100.0 to 2000.020.0 to 500.01.0 to 50.00.4 to 0.6

The generated network starts with a directed cycle through all nodes, which gives every node reachability around the cycle. It then adds a limited number of reverse cycle arcs, shortcut arcs, and finally shuffled random directed arcs without self-loops until the target arc count is reached. n_arcs is capped by n_nodes * (n_nodes - 1), but if the requested arc count is below n_nodes, the initial cycle can still produce more actual arcs than n_arcs.

Arc capacities are sampled from a log-normal distribution centered on the geometric mean of the capacity range, clamped to the range, and rounded to 2 digits. Demands are similarly log-normal over the demand range and rounded to 2 digits. Costs are sampled uniformly from the cost range and multiplied by a congestion factor from 1.0 to about 1.3, where lower-capacity arcs become more expensive.

Commodity pairs are generated as source-sink pairs with a 40% short-haul and 60% long-haul pattern, avoiding duplicate pairs during the first 100 attempts per commodity. If uniqueness attempts fail, additional non-identical source and sink pairs are added without checking duplication.

The struct stores:

  • n_nodes::Int
  • n_arcs::Int
  • n_commodities::Int
  • arcs::Vector{Tuple{Int,Int}}
  • capacities::Dict{Tuple{Int,Int},Float64}
  • demands::Dict{Int,Float64}
  • costs::Dict{Tuple{Int,Int},Float64}
  • commodities::Vector{Tuple{Int,Int}}

LP Formulation

Sets:

  • V = {1, ..., n_nodes}
  • A = arcs
  • K = {1, ..., n_commodities}
  • commodity k has source s_k, sink t_k, and demand d_k

Decision variable:

flow[k,a] >= 0 = flow of commodity k on arc a

Objective:

minimize sum_{a in A} costs[a] * sum_{k in K} flow[k,a]

Shared arc capacities:

sum_{k in K} flow[k,a] <= capacities[a]    for each a in A

Flow conservation and demand satisfaction:

outflow(k,s_k) - inflow(k,s_k) = d_k
inflow(k,t_k) - outflow(k,t_k) = d_k
inflow(k,v) = outflow(k,v)                 for v not in {s_k,t_k}

All variables are continuous and nonnegative. There are no commodity-specific arc eligibility restrictions; every commodity can use every generated arc.

Feasibility Controls

The constructor maps statuses to internal symbols: feasible becomes :feasible, infeasible becomes :infeasible, and unknown becomes :all.

  • feasible: computes total demand and total capacity. If total capacity is below total_demand / capacity_utilization, all capacities are scaled up proportionally. It then checks reachability for each commodity and adds a direct or two-hop path if needed, assigning new arcs random capacities and costs.
  • infeasible: first either scales all capacities down to 30% to 70% of their current values or scales all demands up to 150% to 250%. It then calls enforce_infeasibility_mcf!, which selects up to three commodities and reduces either total outgoing capacity at the commodity source or total incoming capacity at the commodity sink below that commodity's demand. If targeted bottlenecks fail, it inflates one demand beyond total network capacity.
  • unknown: leaves the sampled data in its natural random state, except for the base strongly connected topology.

The feasible path is based on aggregate capacity scaling plus reachability checks. It is intended to improve feasibility but does not solve a multi-commodity feasibility LP. The infeasible path is stronger because a source or sink cut with total capacity below a commodity's demand is a direct certificate of infeasibility.

Model Characteristics

The intended variable count is n_commodities * length(arcs). The struct's n_arcs field records the target/capped arc count, while length(arcs) is the actual number of arcs used by the model. The actual count can differ when the initial connectivity construction adds more arcs or feasible repair adds paths.

Constraint counts are driven by:

  • one shared capacity constraint per actual arc
  • one flow-balance constraint for every commodity-node pair

The resulting matrix is sparse but larger than single-commodity flow: each commodity-arc variable appears in one shared capacity row and in two node-balance rows for that commodity. The model is a continuous LP relaxation of a routing problem; it does not enforce unsplittable flows or integer path choices.

Practical Notes

These instances are useful for testing large sparse LPs with coupling constraints, because arc capacities couple otherwise separate commodity flows. They also exercise denser conservation structure than single-commodity flow. The main caveat is that the feasible generator uses sufficient-looking heuristics rather than optimization-based verification, so generated feasible data should be interpreted as constructed-to-be-feasible rather than formally certified by the constructor.

← All generators
Network & Routing

Network Flow

The network flow generator creates continuous single-commodity flow LPs over a directed graph, either maximizing source outflow or minimizing cost for a required flow.

Objective: Min/Max Continuous variables

Overview

This generator represents routing through a capacitated directed network. Node 1 is the source, the last node is the sink, arcs have capacities and per-unit costs, and intermediate nodes conserve flow. Depending on the sampled objective type, the model either sends as much flow as possible from source to sink or sends a specified target amount at minimum cost.

Generator Data and Sizing

target_variables maps to the number of directed arcs:

target_variables ~= length(arcs)

The constructor seeds Julia's global RNG with Random.seed!(seed). The same target, feasibility status, and seed reproduce the same graph, capacities, costs, objective choice, and target flow.

Problem scale controls node search ranges, density, capacities, and costs:

Target variablesNode search rangeTarget densityCapacity rangeCost range
<= 1004:150.410.0 to 100.01.0 to 10.0
<= 50010:350.210.0 to 500.01.0 to 25.0
> 50020:1000.150.0 to 2000.01.0 to 50.0

The node count is the first value whose rounded dense-arc estimate reaches at least 90% of the target. source_node is always 1 and sink_node is always n_nodes.

Topology generation first creates the path (1,2), (2,3), ..., (n_nodes-1,n_nodes) to ensure source-to-sink connectivity. It then optionally adds arcs from the source to intermediate nodes and from intermediate nodes to the sink with probability 0.3 each, followed by shuffled random directed arcs without self-loops until the requested arc count is reached. If the path alone already exceeds target_variables, the returned graph can have more arcs than the target.

Each arc gets:

  • capacity sampled uniformly from the scale-specific floating-point interval and rounded to 2 digits
  • cost sampled uniformly from the scale-specific floating-point interval and rounded to 2 digits

The struct stores:

  • n_nodes::Int
  • source_node::Int
  • sink_node::Int
  • arcs::Vector{Tuple{Int,Int}}
  • capacities::Dict{Tuple{Int,Int},Float64}
  • costs::Dict{Tuple{Int,Int},Float64}
  • flow_objective::Symbol, either :max_flow or :min_cost
  • target_flow::Union{Float64,Nothing}

Objective selection depends on target size: targets <= 100 choose :max_flow with probability 0.7; larger targets choose :max_flow with probability 0.4. The remaining cases use :min_cost.

LP Formulation

Sets:

  • V = {1, ..., n_nodes}
  • A = arcs
  • s = source_node
  • t = sink_node

Decision variable:

flow[a] >= 0 = flow on directed arc a

Capacity bounds are modeled as constraints:

flow[i,j] <= capacities[(i,j)]    for each (i,j) in A

Intermediate-node flow conservation:

sum_{(i,v) in A} flow[i,v] = sum_{(v,j) in A} flow[v,j]
    for each v in V \ {s,t}

For :max_flow, the objective is:

maximize sum_{(s,j) in A} flow[s,j]

For :min_cost, the objective and source-flow requirement are:

minimize sum_{a in A} costs[a] * flow[a]
sum_{(s,j) in A} flow[s,j] = target_flow

The model does not add a corresponding sink inflow requirement. Conservation at intermediate nodes plus the graph structure usually forces flow to terminate at the sink when there are no useful cycles, but cycles and arcs into the source or out of the sink are allowed by topology generation.

Feasibility Controls

The generator estimates maximum available flow as the sum of capacities on arcs leaving the source. This is an upper bound, not an exact max-flow solve.

  • feasible: for :min_cost, sets target_flow to 10% to 40% of the source-out capacity estimate. For :max_flow, no extra requirement is added.
  • infeasible: if the sampled objective is :min_cost, sets target_flow to 120% to 170% of the source-out capacity estimate. If the sampled objective is :max_flow, the constructor switches to :min_cost and uses the same impossible target rule.
  • unknown: randomly maps to feasible with probability 0.7 or infeasible with probability 0.3 before applying the rules above.

Because the target-flow logic is based only on source-out capacity, feasible :min_cost instances are intended to be feasible but are not verified by an exact max-flow computation. Infeasible target-flow instances are guaranteed with respect to the model's source-out equality and source capacity constraints when the estimate is positive.

Model Characteristics

The model has one continuous nonnegative variable per arc. It has one capacity constraint per arc, plus one conservation constraint for each intermediate node with incident arcs. A :min_cost instance with a target flow adds one source-flow equality.

The matrix is sparse: each arc variable appears in its own capacity row and in up to two conservation rows, plus possibly the source-flow row. The formulation is continuous. Network flow LPs often have integral extreme points under integral capacities for standard source-sink formulations, but this generator does not declare integer variables.

Practical Notes

These instances are useful for testing sparse network LP behavior, objective switching, and capacity-conservation structure. The implementation permits arbitrary directed arcs, including arcs into the source, arcs out of the sink, and cycles. The max_flow_estimate is deliberately simple, so it should be treated as a data-generation heuristic rather than a certified graph-theoretic max flow.

← All generators
Network & Routing

Telecom Network Design

Generates multicommodity telecom network design instances with binary link installation, continuous routed traffic, capacities, and an installation budget.

Objective: Min Mixed variables

Overview

This generator represents telecommunications network planning. It creates a geographic network of potential undirected physical links, directed routing arcs in both directions, and traffic commodities with source, sink, and demand. The model chooses which physical links to install and how to route every commodity while minimizing installation and routing cost.

Generator Data and Sizing

target_variables is interpreted as:

n_arcs * (2 * n_commodities + 1)

For each physical arc, the model has one installation variable and one flow variable for each commodity in each of the two directions.

Scale-dependent ranges:

Scale conditionNodesPhysical arcsCommodities
target_variables <= 1004-125-253-15
target_variables <= 10008-3515-12010-80
otherwise20-12050-60020-300

The constructor searches over n_arcs and the implied n_commodities. If no combination is within 10% error, it uses a square-root heuristic. Node count is then derived from arc count using a random density factor between 1.5 and 2.5 arcs per node, subject to scale bounds and complete-graph feasibility.

Scale-dependent parameter ranges:

ScaleGridBase install costCost/kmFlow cost/unitDemand rangeBudget factorCapacity modules
small100-50010000-5000050-1500.005-0.02sampled 1-5 to 50-1500.4-0.7155, 622, 2488
medium500-200030000-10000080-2000.01-0.05sampled 5-15 to 100-3000.5-0.8155, 622, 2488, 9953
large2000-10000100000-500000150-5000.02-0.1sampled 10-30 to 200-10000.6-0.9622, 2488, 9953, 39813

Random data generation:

  • Node locations are clustered around max(1, div(n_nodes, 3)) random centers with normal offsets and clamping to the grid.
  • Topology starts with a nearest-neighbor spanning tree to ensure connectivity, then adds mostly short arcs, then random remaining arcs if needed.
  • Physical arcs are stored canonically with i < j; directed_arcs contains both (i, j) and (j, i).
  • Distances are Euclidean and stored for both directions.
  • Link capacities are selected from optical capacity modules. Longer links bias toward larger modules.
  • Installation costs equal a base term adjusted by capacity module plus distance cost with 10% random noise.
  • Flow costs are proportional to distance and stored for both directions.
  • Commodities are random source-sink pairs. Hub-to-hub traffic, where both endpoints are among the first quarter of nodes, receives higher log-normal demand.
  • Initial budget is sum(installation_costs) * budget_factor.

The stored struct fields are:

  • n_nodes
  • n_arcs
  • n_commodities
  • arcs
  • directed_arcs
  • node_locations
  • distances
  • installation_costs
  • link_capacities
  • flow_costs
  • commodities
  • budget
  • outgoing_arcs
  • incoming_arcs

The constructor calls Random.seed!(seed), so generation is reproducible for a fixed seed but resets Julia's global RNG state.

LP Formulation

Sets and indices:

  • N = {1, ..., n_nodes}: network nodes.
  • A: physical undirected arcs, stored as canonical tuples.
  • D: directed arcs, containing both directions for each physical arc.
  • K = {1, ..., n_commodities}: traffic commodities.

For a commodity k, let s_k be its source, t_k its sink, and d_k its demand.

Decision variables:

y_a in {0, 1}          for a in A
f_{k,u,v} >= 0         for k in K, (u, v) in D

y_a = 1 means physical link a is installed. f_{k,u,v} is flow of commodity k on directed arc (u, v).

Objective:

minimize
    sum_{a in A} installation_cost_a y_a
  + sum_{(u,v) in D} flow_cost_{u,v} sum_{k in K} f_{k,u,v}

Flow conservation:

sum_{(n,j) in outgoing(n)} f_{k,n,j}
  - sum_{(i,n) in incoming(n)} f_{k,i,n}
    = d_k       if n = s_k
    = -d_k      if n = t_k
    = 0         otherwise

Capacity on each physical link a = (i, j):

sum_{k in K} (f_{k,i,j} + f_{k,j,i}) <= link_capacity_a y_a

Budget:

sum_{a in A} installation_cost_a y_a <= budget

Bounds:

y_a binary
f_{k,u,v} >= 0

At the package API level, generate_problem(...; relax_integer=true) is the default, so installation variables are relaxed unless the caller sets relax_integer=false.

Feasibility Controls

  • feasible: The generator first checks whether all demands can be greedily routed on the full network using can_route_demands. If not, it repeatedly scales link capacities up by factors up to a maximum cumulative scale of 50, with one final doubling attempt. It then greedily selects a high capacity-per-cost subset of links, first building connectivity and then adding links until can_route_demands succeeds. Budget is set to at least the selected-link installation cost times 1.05 to 1.20.
  • infeasible: The generator tightens budget below a greedy spanning-tree cost estimate, using a multiplier from 0.45 to 0.75. It also targets the busiest source and sink nodes and reduces incident link capacities to only 25% to 60% of their associated traffic. If routing still appears possible, it globally halves capacities up to three times, then may directly cap the busiest source's incident capacity to about 10% of outgoing demand.
  • unknown: No explicit feasibility adjustment is applied; the original budget and sampled capacities are used.

The feasibility checks are constructive heuristics, not calls to the final JuMP solver. They route commodities one at a time along shortest paths with remaining undirected capacity.

Model Characteristics

  • Variables: n_arcs installation variables plus 2 * n_arcs * n_commodities flow variables.
  • Constraints: n_commodities * n_nodes flow-conservation equalities, n_arcs capacity constraints, and one budget constraint.
  • Density: flow-conservation rows are sparse network incidence rows; capacity rows touch both directions for every commodity on one physical link plus the corresponding install variable.
  • Intended model class: mixed-integer multicommodity network design.
  • Default generated LP: with the package default relax_integer=true, installation decisions become continuous in [0, 1], producing the LP relaxation.

Practical Notes

These instances are useful for testing large sparse network matrices, multicommodity flow structure, and fixed-charge design relaxations. The helper can_route_demands is greedy and capacity-based, so it is a generation heuristic rather than a proof procedure for all cases. The generated physical network is undirected for capacity and installation, but flow variables are directed.

← All generators
Network & Routing

Transportation

The transportation generator creates continuous minimum-cost shipping LPs that move goods from supply sources to demand destinations.

Objective: Min Continuous variables

Overview

This generator represents a logistics planning problem: each source has a finite supply of a homogeneous good, each destination has a demand requirement, and the model chooses shipment quantities on all source-destination lanes to minimize total transportation cost. It is the classic balanced-or-unbalanced transportation structure, but the implementation uses <= supply constraints and >= demand constraints, so feasible instances may leave unused supply.

Generator Data and Sizing

target_variables maps to the lane count:

target_variables ~= n_sources * n_destinations

The constructor seeds Julia's global RNG with Random.seed!(seed), so the same inputs reproduce the same dimensions, ranges, and data.

Dimensions are chosen by starting near sqrt(target_variables), multiplying the source side by a random ratio in [0.5, 1.5), and setting destinations from target_variables / n_sources. Both dimensions are at least 2. A final adjustment tries to keep the product within about 10% of the target.

Generated ranges depend on the final lane count:

Lane countSupply range endpointsDemand range endpointsCost range endpoints
<= 250low from 50:100, high from 200:500low from 30:80, high from 150:300low from 5:15, high from 25:60
<= 1000low from 100:500, high from 1000:5000low from 80:300, high from 800:3000low from 10:30, high from 50:150
> 1000low from 500:2000, high from 5000:50000low from 300:1500, high from 3000:30000low from 20:100, high from 100:500

After each range is sampled, supplies, demands, and costs are drawn uniformly from the resulting integer intervals. The struct stores:

  • n_sources::Int
  • n_destinations::Int
  • supplies::Vector{Int}
  • demands::Vector{Int}
  • costs::Matrix{Int}

LP Formulation

Sets:

  • I = {1, ..., n_sources} for sources
  • J = {1, ..., n_destinations} for destinations

Decision variable:

x[i,j] >= 0 = amount shipped from source i to destination j

Objective:

minimize sum_{i in I, j in J} costs[i,j] * x[i,j]

Constraints:

sum_{j in J} x[i,j] <= supplies[i]    for each source i
sum_{i in I} x[i,j] >= demands[j]     for each destination j

The source constraints limit outbound shipments by available supply. The destination constraints require every destination's demand to be met or exceeded. There are no upper bounds on individual lanes beyond the source totals.

Feasibility Controls

The constructor first samples supplies and demands, then compares total supply and total demand.

  • feasible: if sum(supplies) < sum(demands), it adds the shortage across the supply vector using random weights and integer rounding. This guarantees total_supply >= total_demand, which is sufficient for this complete bipartite transportation model.
  • infeasible: it creates an aggregate shortage by ensuring total_demand > total_supply with a random margin equal to 2% to 10% of total supply. If demand is not already high enough, it distributes the missing amount across demands.
  • unknown: leaves the sampled supplies and demands unchanged.

The helper distribute_additions! uses random weights, floors proportional additions, and distributes any integer remainder over a random permutation of entries.

Model Characteristics

The model has n_sources * n_destinations continuous nonnegative variables. It has n_sources + n_destinations structural constraints. The coefficient matrix is sparse in the usual transportation sense: each variable appears in exactly one source constraint and one destination constraint, plus the objective.

There are no integer declarations. Although transportation problems often have integral extreme points when supplies and demands are integral, this implementation builds the continuous LP directly.

Practical Notes

These instances are useful for testing LP solvers on highly structured network-like matrices with predictable row and column sparsity. The generator uses a complete lane set, so it does not model missing routes, lane capacities, or transshipment. Feasibility is controlled only through aggregate supply and demand, which is enough because every source can ship to every destination.

← All generators
Facility & Supply Chain

Facility Location

Generates capacitated facility location instances with fixed opening costs, customer demand, shipping costs, and an opening budget.

Objective: Min Mixed variables

Overview

This generator represents a distribution network design problem. A planner chooses which candidate facilities to open and how much demand to serve from each open facility. The model minimizes fixed facility costs plus customer shipping costs while satisfying all customer demand, respecting facility capacities, and staying within an opening budget.

Generator Data and Sizing

target_variables is interpreted as:

n_facilities * (n_customers + 1)

This matches one open variable per facility plus one shipping variable for every facility-customer pair.

Scale-dependent ranges:

Scale conditionFacilitiesCustomersGrid sizeTransport cost/kmCapacity factorBudget factor
target_variables <= 1002-201-40200-800 by 200-8000.5-1.21.1-1.60.4-0.8
target_variables <= 10003-1005-200500-2000 by 500-20000.8-1.81.2-1.80.5-0.9
otherwise5-50010-20001000-5000 by 1000-50001.0-3.01.3-2.00.6-0.95

The constructor searches over facility counts and chooses the customer count with the lowest relative variable-count error. If no combination is within 10%, it uses a square-root heuristic.

Random data generation:

  • Facility locations are uniform on the sampled rectangle.
  • Customer locations are clustered around max(2, div(n_customers, 20)) random centers with normal offsets and clamping to the grid.
  • Customer demands are log-normal around the midpoint of sampled min/max demand ranges.
  • Average facility capacity is (total_demand / n_facilities) * capacity_factor.
  • Individual capacities vary by a factor in [0.8, 1.2].
  • Fixed costs depend on sampled fixed-cost ranges, relative capacity, and a location factor.
  • Shipping costs are Euclidean distance times sampled transport cost per km times random noise in [0.9, 1.1].
  • Initial budget is sum(fixed_costs) * budget_factor.

The stored struct fields are:

  • n_facilities
  • n_customers
  • facility_locs
  • customer_locs
  • demands
  • fixed_costs
  • capacities
  • shipping_costs
  • budget

The constructor calls Random.seed!(seed), so generation is reproducible for a fixed seed but resets Julia's global RNG state.

LP Formulation

Sets and indices:

  • W = {1, ..., n_facilities}: candidate facilities.
  • C = {1, ..., n_customers}: customers.

Decision variables:

y_w in {0, 1}
x_{w,c} >= 0

y_w = 1 means facility w is opened. x_{w,c} is the amount shipped from facility w to customer c.

Objective:

minimize
    sum_{w in W} fixed_cost_w y_w
  + sum_{w in W} sum_{c in C} shipping_cost_{w,c} x_{w,c}

Constraints:

Demand:

sum_{w in W} x_{w,c} >= demand_c    for each c in C

Facility capacity:

sum_{c in C} x_{w,c} <= capacity_w y_w    for each w in W

Opening budget:

sum_{w in W} fixed_cost_w y_w <= budget

Bounds:

y_w binary
x_{w,c} >= 0

At the package API level, generate_problem(...; relax_integer=true) is the default, so the binary open variables are relaxed unless the caller sets relax_integer=false.

Feasibility Controls

  • feasible: If total capacity is below total demand, all capacities are scaled so total capacity is at least 1.05 * total_demand. Facilities are sorted by capacity-per-fixed-cost ratio. A greedy integer subset is selected until total demand can be covered, and budget is set to at least that subset cost times a random slack factor from 1.02 to 1.25.
  • infeasible: The same ratio ordering is used to compute a fractional lower bound on the minimum opening cost needed to reach total demand capacity. If the bound is finite, budget is tightened to 75% to 95% of that threshold. If the bound is infinite, budget is set below a fraction of total fixed cost.
  • unknown: The original sampled budget is retained without capacity scaling or budget tightening guarantees.

Model Characteristics

  • Variables: n_facilities * n_customers shipment variables plus n_facilities open variables.
  • Constraints: n_customers demand constraints, n_facilities capacity constraints, and one budget constraint.
  • Density: demand rows touch all facilities for one customer; capacity rows touch all customer shipments for one facility plus that facility's open variable; the budget row touches only open variables.
  • Intended model class: mixed-integer capacitated facility location.
  • Default generated LP: with the package default relax_integer=true, facility open decisions become continuous in [0, 1], yielding a capacitated facility-location relaxation.

Practical Notes

The model allows overserving customers because demand constraints are >=, not equality. Shipping variables represent quantities rather than assignment fractions. The infeasible mode is designed around a fractional capacity-cost threshold, so it is especially relevant to the default LP relaxation as well as to the integer model.

← All generators
Facility & Supply Chain

Supply Chain

SupplyChainProblem generates a facility-location and transportation model with geographic customer clusters, mode-specific route availability, fixed facility-opening costs, and capacity constraints.

Objective: Min Mixed variables

Overview

This generator represents strategic supply chain network design. A planner decides which facilities to open and how much demand to ship from open facilities to customers over available transportation modes. Costs include fixed facility costs and distance-based transportation costs.

Generator Data and Sizing

The code comments describe variables as approximately facility-open variables plus shipment variables:

n_facilities + n_facilities * n_customers * n_transport_modes * infrastructure_density

In practice, target_variables selects one of three size regimes rather than solving dimensions to match the target exactly:

  • target_variables <= 250: n_facilities from 3:8, n_customers from 15:35, modes from 1:2, grid dimensions 200-800, infrastructure density roughly 0.7-1.0, clustering factor roughly 0.25-0.85.
  • target_variables <= 1000: n_facilities from 6:18, n_customers from 25:65, modes from 2:3, grid dimensions 800-2000, infrastructure density roughly 0.5-0.9, clustering factor roughly 0.2-0.7.
  • larger targets: n_facilities from 12:40, n_customers from 60:200, modes from 3:4, grid dimensions 2000-5000, infrastructure density roughly 0.4-0.8, clustering factor roughly 0.15-0.55.

Transport modes are sampled from truck, rail, ship, and air. Base transport costs use gamma distributions: truck Gamma(4, 0.25), rail Gamma(3, 0.2), ship Gamma(2, 0.15), air Gamma(6, 0.5).

Customers are drawn around cluster centers. Cluster weights come from a Dirichlet distribution. Facilities are more dispersed, but 40 percent are placed near a cluster center. Fixed facility costs are lognormal-scale costs correlated with market potential and location. Customer demands are lognormal multipliers applied to cluster-influenced base demand. Capacities are based on total demand per facility, a random capacity factor 1.2-2.2, relative fixed cost, and a gamma multiplier.

Route availability depends on mode:

  • Truck is almost always available before density scaling, with base probability 0.98.
  • Rail availability increases with distance, capped at 0.8.
  • Ship is available only when either endpoint lies near the bottom grid edge, otherwise probability 0.
  • Air is more likely for long distances.

Mode capacities are generated as total-demand fractions using mode_capacity_factor from 0.25-0.65 and mode-specific gamma multipliers.

The struct stores:

  • n_facilities, n_customers
  • transport_modes
  • facility_locs, customer_locs
  • cluster_centers, cluster_weights
  • fixed_costs, demands, capacities
  • transport_costs, keyed by (facility, customer, mode) only for available routes
  • mode_capacities
  • total_demand

The constructor calls Random.seed!(seed), resetting Julia's global RNG.

LP Formulation

The implemented model is a mixed-integer linear model, not a pure LP, because facility-open variables are binary.

Sets:

  • F = {1, ..., n_facilities} facilities
  • C = {1, ..., n_customers} customers
  • M selected transport modes
  • A = {(f,c,m): transport_costs has key (f,c,m)} available route-mode arcs

Decision variables:

  • y_f in {0,1}: whether facility f is open
  • x_{fcm} >= 0: shipment from facility f to customer c by mode m on available arcs

Objective:

\[\min \sum_{f \in F} F_f y_f + \sum_{(f,c,m) \in A} q_{fcm} x_{fcm}\]

Customer demand satisfaction:

\[\sum_{(f,c,m) \in A: c \text{ fixed}} x_{fcm} \ge d_c \quad \forall c \in C\]

Facility capacity gated by opening:

\[\sum_{(f,c,m) \in A: f \text{ fixed}} x_{fcm} \le K_f y_f \quad \forall f \in F\]

Mode capacity:

\[\sum_{(f,c,m) \in A: m \text{ fixed}} x_{fcm} \le U_m \quad \forall m \in M\]

Feasibility Controls

For feasible, the generator strengthens route connectivity and capacities after initial random data. It selects a fallback mode, preferring truck when present. For each customer, it ensures arcs from the K nearest facilities exist in the fallback mode, where K = min(max(3, ceil(n_facilities / 3)), n_facilities). It then computes an approximate demand share for each facility from those nearest-facility links and raises capacities to at least 1.05 times that share. It also ensures the fallback mode can carry all demand with 5 percent slack, scales aggregate mode capacity if needed, and scales aggregate facility capacity if needed.

For infeasible, it creates a transport-capacity shortfall. It picks a desired total mode capacity ratio from Uniform(0.7, 0.95) times total demand and scales mode capacities downward if the current total exceeds that value. Since all demand must ship through modes, aggregate mode capacity below total demand makes the model infeasible.

For unknown, no special feasibility branch is applied. The randomly generated route availability, facility capacity, and mode capacity data are returned as generated and may be feasible or infeasible.

Model Characteristics

The model has n_facilities binary variables plus one continuous shipment variable per available route-mode combination. Constraint count is n_customers + n_facilities + length(transport_modes). Shipment variables appear in demand, facility, and mode rows. Route density is controlled by infrastructure availability and can be much less than the full facility-customer-mode product, especially for ship and air.

Because y is binary, this is a MILP. The LP relaxation would replace y_f in {0,1} with 0 <= y_f <= 1, but the implementation uses JuMP Bin.

Practical Notes

This generator is useful for network design, sparse transportation arcs, fixed-charge facility opening, and aggregate-capacity infeasibility. target_variables is only a regime selector; actual variable count can vary substantially with random dimensions and route density. The local variable avg_density is defined but not used in the constructor's sizing logic.

← All generators
Blending & Diet

Blending

BlendingProblem generates a continuous least-cost mixture model whose ingredients must satisfy quality bands, supply limits, and optional usage rules.

Objective: Min Continuous variables

Overview

This generator represents industrial blending problems such as fuel, chemical, food, or material mixing. A planner chooses nonnegative quantities of available ingredients, pays ingredient costs, and must produce at least a target blend amount while keeping the weighted-average quality attributes inside acceptable lower and upper bands.

Generator Data and Sizing

target_variables maps directly to the number of ingredient variables:

n_ingredients = max(3, min(500, target_variables))

The generator draws n_attributes uniformly from 2:15 and min_blend_amount uniformly from integer values 100:20000, converted to Float64. Ingredient costs are integer values from 10:100. Attribute values are sampled on a 0.01 grid from 0.1:0.01:0.9, producing an n_ingredients x n_attributes matrix.

The struct stores:

  • n_ingredients, n_attributes
  • costs::Vector{Int}
  • attributes::Matrix{Float64}, indexed by ingredient and attribute
  • lower_bounds, upper_bounds for average quality attributes
  • supply_limits, with Inf meaning no explicit limit
  • cost_budget, with Inf meaning no explicit budget
  • min_blend_amount
  • min_usage_required::Dict{Int,Float64}
  • max_usage_limits::Dict{Int,Float64}

The constructor calls Random.seed!(seed), so it resets Julia's global RNG. With the same seed, target size, and feasibility status, the generated instance is reproducible in the same Julia/runtime environment.

LP Formulation

Sets:

  • I = {1, ..., n_ingredients} ingredients
  • A = {1, ..., n_attributes} quality attributes

Decision variable:

  • x_i >= 0: amount of ingredient i used in the blend

Objective:

\[\min \sum_{i \in I} c_i x_i\]

Constraints:

Minimum production:

\[\sum_{i \in I} x_i \ge B\]

Finite ingredient supply limits:

\[x_i \le s_i \quad \text{for finite } s_i\]

Finite cost budget:

\[\sum_{i \in I} c_i x_i \le C\]

Optional ingredient minimum and maximum usage rules:

\[x_i \ge m_i\]
\[x_i \le u_i\]

Quality bands are written as linear weighted-average constraints:

\[\sum_{i \in I} a_{ij} x_i \ge L_j \sum_{i \in I} x_i \quad \forall j \in A\]
\[\sum_{i \in I} a_{ij} x_i \le U_j \sum_{i \in I} x_i \quad \forall j \in A\]

The model is a continuous LP. No integer or binary variables are used.

Feasibility Controls

The constructor translates feasibility_status into an internal target:

  • feasible generates a constructed baseline blend and sets constraints around it.
  • infeasible selects one of four infeasibility scenarios.
  • unknown randomly chooses feasible or infeasible with probability 0.5 each.

For feasible instances, the generator builds a baseline solution. It scores ingredients by average quality divided by cost, assigns about 60 percent of ingredients as primary ingredients, allocates 80 percent of the blend amount across primaries and 20 percent across secondaries, then rescales to exactly min_blend_amount. Achieved average qualities from that blend are used to construct tight quality bands. The tolerance level is drawn from one of three regimes: roughly 0.025-0.05, 0.04-0.07, or 0.06-0.08. Supply limits are set above the baseline amounts, with tighter slack on critical primary ingredients. The cost budget is set above actual baseline cost by about 6-25 percent. Minimum usage rules are added for about one quarter of ingredients at 70-90 percent of baseline usage, and maximum usage rules for about one third of ingredients at 120-150 percent of baseline usage.

For infeasible instances, the generator chooses one of four cases:

  • Supply shortage conflict: quality lower bounds are set near the best available values, but supply for critical high-quality ingredients is limited to a small total share of the required blend.
  • Budget impossibility: quality lower bounds are set near best qualities and the budget is set to only 60-90 percent of a lower-bound cost estimate.
  • Impossible quality conflict: up to three attributes receive lower bounds near the maximum observed attribute value and very tight upper bounds.
  • Over-constrained system: early attributes are constrained near maxima, other attributes near midpoints, expensive ingredients are tightly supply-limited, and the budget is set to about 90 percent of average-cost production.

The infeasible cases are designed to create contradictions, but they are heuristic rather than solver-certified at construction time.

Model Characteristics

Variable count is exactly n_ingredients, capped at 500 and floored at 3. The base constraints include one production constraint, two quality constraints per attribute, finite supply constraints, a finite budget constraint, and optional usage bounds. Feasible instances usually have all supply limits finite, one finite budget, and many optional usage rules. The quality constraints are dense across all ingredient variables. Supply and usage constraints are one-variable sparse rows.

The model is continuous. It represents divisible ingredient amounts, not lot-sized procurement or discrete batches.

Practical Notes

This generator is useful for testing dense weighted-average constraints, cost-budget interactions, and feasibility diagnostics in continuous mixture LPs. The quality constraints are linearized by multiplying the average bounds by total blend amount, so they remain LP constraints. unknown does not mean unconstrained random data; it means a random choice between the same feasible and infeasible construction paths. The constructor uses the global RNG, which can affect or be affected by surrounding randomized code.

← All generators
Blending & Diet

Diet Problem

The diet problem generator creates continuous minimum-cost food selection LPs with nutrient requirements, supply limits, budget limits, and optional food-specific consumption bounds.

Objective: Min Continuous variables

Overview

This generator represents nutrition planning under market and dietary restrictions. Foods have costs and nutrient contents; the model chooses nonnegative consumption quantities that meet nutrient minimums while minimizing total cost. Generated instances may also include food supply ceilings, an overall cost budget, minimum consumption requirements for preferred foods, and maximum consumption limits for restricted foods.

Generator Data and Sizing

target_variables maps directly to the number of foods:

target_variables = n_foods

The constructor seeds Julia's global RNG with Random.seed!(seed), so foods, nutrients, constraints, and infeasibility scenarios are reproducible for the same inputs.

The number of nutrients and data ranges scale with target size:

Target variablesNutrientsCost endpoint rangesNutrient endpoint ranges
<= 1005:min(25, max(5, target_variables / 4))low from 0.5:0.1:2.0, high from 3.0:0.5:8.0low from 0.05:0.01:0.15, high from 1.5:0.1:3.0
<= 100015:min(75, max(15, target_variables / 8))low from 0.1:0.05:1.0, high from 2.0:0.5:10.0low from 0.01:0.005:0.1, high from 1.0:0.2:4.0
> 100025:min(150, max(25, target_variables / 15))low from 0.05:0.01:0.5, high from 1.0:0.2:15.0low from 0.005:0.001:0.05, high from 0.5:0.1:5.0

Costs are sampled as rand(min_cost:0.1:max_cost, n_foods). Nutrient contents are sampled as rand(min_nutrient:0.1:max_nutrient, n_foods, n_nutrients). Requirements start at zero, supply limits and cost budget start at Inf, and food-specific min/max dictionaries start empty before feasibility logic fills them.

The struct stores:

  • n_foods::Int
  • n_nutrients::Int
  • costs::Vector{Float64}
  • nutrient_content::Matrix{Float64}
  • requirements::Vector{Float64}
  • food_supply_limits::Vector{Float64}
  • cost_budget::Float64
  • min_food_amounts::Dict{Int, Float64}
  • max_food_amounts::Dict{Int, Float64}

LP Formulation

Sets:

  • F = {1, ..., n_foods} for foods
  • N = {1, ..., n_nutrients} for nutrients

Decision variable:

x[i] >= 0 = amount of food i consumed

Objective:

minimize sum_{i in F} costs[i] * x[i]

Nutrient requirements:

sum_{i in F} nutrient_content[i,j] * x[i] >= requirements[j]
    for each nutrient j

Supply limits, when finite:

x[i] <= food_supply_limits[i]

Budget limit, when finite:

sum_{i in F} costs[i] * x[i] <= cost_budget

Food-specific consumption restrictions:

x[i] >= min_food_amounts[i]    for listed foods
x[i] <= max_food_amounts[i]    for listed foods

All variables are continuous and nonnegative.

Feasibility Controls

unknown is first mapped randomly to feasible with probability 0.75 or infeasible with probability 0.25. The selected actual status controls the rest of generation.

Feasible

The feasible path constructs a baseline diet rather than solving a verification LP.

  1. It scores foods by average nutrient content divided by cost.
  2. It allocates 75% of a 100-unit baseline diet across the top 60% of foods by cost-effectiveness, weighted by effectiveness, and spreads the remaining 25% across the rest.
  3. It computes achieved nutrient levels from that baseline diet.
  4. It sets each nutrient requirement below the baseline achievement using a tolerance scenario: 2% to 5%, 5% to 10%, or 8% to 12%.
  5. It creates finite supply limits under one of three scenarios: seasonal availability, market supply, or normal supply. Limits are multiples of baseline amounts, so the baseline remains intended to fit.
  6. It sets a finite cost budget based on baseline cost: tight 105% to 115%, moderate 110% to 125%, or generous 150% to 200%.
  7. With probability 0.7, it adds minimum amounts for about one-sixth of foods and maximum amounts for about one-fifth of foods, both derived from baseline amounts.

Infeasible

The infeasible path chooses one of four scenarios and then performs a final verification-style strengthening step.

  • Nutrient impossibility: gives foods finite supplies, computes maximum achievable nutrients under those supplies, and sets one nutrient requirement above its maximum.
  • Budget impossibility: sets broad food supplies, creates nutrient requirements, estimates a lower bound on cost required to satisfy them, and sets the budget below that estimate.
  • Supply shortage: sets requirements and supplies, then reduces supplies for foods contributing to a target nutrient until that nutrient cannot be met.
  • Over-constrained system: builds a baseline diet, tightens requirements, supply, budget, minimum consumption of expensive foods, and maximum consumption of nutritious foods, then forces one nutrient above its achievable maximum under the resulting bounds.

After any infeasible scenario, the constructor computes, for each nutrient, the maximum possible amount under food supply and food-specific maximum constraints. It then forces one target nutrient requirement to 200% to 300% of that maximum, or to 100.0 to 200.0 if the maximum is zero. This final step is the strongest infeasibility guarantee in the implementation.

Model Characteristics

The model has n_foods continuous nonnegative variables. It always has n_nutrients nutrient constraints. Additional constraints are added for each finite supply limit, for a finite budget, for each minimum food amount, and for each maximum food amount.

Nutrient rows and the budget row are dense across foods. Supply and food-specific bound constraints are one-variable rows. The formulation is a continuous LP; it does not model integer servings or discrete package counts.

Practical Notes

These instances are useful for testing dense covering-style constraints combined with simple bound rows and an optional budget cap. Feasible instances are constructed around a baseline diet and are meant to be challenging but feasible; infeasible instances include explicit maximum-achievable nutrient checks. One implementation detail to note is that generated nutrient contents use a 0.1 step even for small endpoint ranges, so some scale settings may produce coarser nutrient values than the endpoint precision suggests.

← All generators
Blending & Diet

Feed Blending

FeedBlendingProblem generates a continuous least-cost diet model that mixes ingredients to meet a required batch size, nutrient requirements, ingredient availabilities, and optional nutrient-ratio limits.

Objective: Min Continuous variables

Overview

This generator represents feed formulation or diet blending. A planner chooses how much of each ingredient to include in a fixed-size batch. Ingredients have costs and nutrient content, and the final recipe must meet minimum nutritional requirements while respecting maximum limits for restricted nutrients or anti-nutrients.

Generator Data and Sizing

target_variables maps to ingredient count:

num_ingredients = max(3, target_variables)

The number of nutrients and batch size scale by target size:

  • target_variables <= 250: num_nutrients from 4:8, batch size from Normal(500, 200) truncated to [100, 2000]
  • target_variables <= 1000: num_nutrients from 6:12, batch size from Normal(2000, 800) truncated to [500, 10000]
  • larger targets: num_nutrients from 8:20, batch size from Normal(10000, 5000) truncated to [2000, 50000]

Costs are lognormal and become less dispersed as ingredient count grows:

  • up to 250 ingredients: exp(Normal(log(4.0), 0.8))
  • up to 1000 ingredients: exp(Normal(log(2.5), 0.6))
  • larger: exp(Normal(log(1.8), 0.4))

Each nutrient is assigned a type from 1:4:

  • Type 1 major nutrients: mostly positive max(0, Normal(20, 7)), with occasional high (1.5-2.5x) or low (0.2-0.6x) multipliers.
  • Type 2 minor nutrients: present with probability 0.7, sampled as max(0, Normal(2, 1)), sometimes multiplied by 2-5x.
  • Type 3 trace nutrients: present with probability 0.3, sampled as max(0, Normal(0.5, 0.3)), sometimes multiplied by 3-10x.
  • Type 4 anti-nutrients or upper-limited compounds: present with probability 0.6, sampled as max(0, Normal(5, 3)), sometimes multiplied by 1.5-3x.

The generator repairs all-zero nutrient rows and all-zero ingredient columns by adding positive Normal(2, 1) values. Ingredient availability is Inf unless an availability limit is drawn. The probability of a finite availability is truncated normal, increasing with size: about 0.1-0.4, 0.15-0.45, or 0.2-0.5. Finite availability values are sampled as fractions of batch size from truncated normal distributions whose centers increase with scale.

The struct stores:

  • num_ingredients, num_nutrients
  • batch_size
  • costs
  • nutrient_content::Matrix{Float64}, indexed as nutrient by ingredient
  • nutrient_types
  • min_requirements, max_limits
  • availabilities
  • ratio_constraints::Vector{Tuple}, storing (nutrient_idx, target_pct, type_string)

The constructor uses rng = Random.MersenneTwister(seed) and passes it through most random calls, so it does not intentionally reset Julia's global RNG.

LP Formulation

Sets:

  • I = {1, ..., num_ingredients} ingredients
  • N = {1, ..., num_nutrients} nutrients
  • R ratio constraints

Decision variable:

  • x_i >= 0: amount of ingredient i in the batch

Objective:

\[\min \sum_{i \in I} c_i x_i\]

Fixed batch size:

\[\sum_{i \in I} x_i = B\]

Nutrient minimums and maximums:

\[\sum_{i \in I} a_{ji} x_i \ge r_j \quad \text{for } r_j > 0\]
\[\sum_{i \in I} a_{ji} x_i \le u_j \quad \text{for finite } u_j\]

Finite ingredient availabilities:

\[x_i \le A_i\]

Ratio constraints are average-content bounds. For a minimum target percentage p:

\[\sum_{i \in I} (a_{ji} - p) x_i \ge 0\]

For a maximum target percentage p:

\[\sum_{i \in I} (a_{ji} - p) x_i \le 0\]

The implementation chooses the ratio direction by checking whether the type string contains "min". This means injected strings such as "min_above_achievable" are treated as minimum constraints.

Feasibility Controls

For feasible, the generator first builds a base recipe x0. It samples a Dirichlet allocation across ingredients, clips by finite availabilities, and fills any remaining batch amount by a randomized cheap-ingredient order. If finite availability capacity is too low, it gives the cheapest ingredient enough capacity to cover the batch. Nutrient requirements are then drawn inside achievable minimum/maximum average intervals, with behavior depending on nutrient type. Afterward, the requirements are repaired so the constructed x0 satisfies every active minimum and maximum. Ratio constraints, if generated, are also biased so x0 satisfies them.

For infeasible, the constructor first skips the feasible requirement construction, generates optional random ratio constraints as it would for non-feasible cases, then injects one infeasibility mechanism:

  • A minimum ratio above any ingredient's content for a nutrient with positive content.
  • A minimum ratio above the availability-aware achievable maximum average, usually for a major nutrient.
  • A maximum ratio below the availability-aware achievable minimum average.
  • An availability shortage where total ingredient capacity is forced below batch size.

For unknown, nutrient requirements are generated randomly without repair guarantees. Minimum requirement factors come from truncated Normal(0.4, 0.1) on [0.2, 0.6]; maximum limit factors come from truncated Normal(1.5, 0.2) on [1.1, 2.0]. Ratio constraints are generated as in non-feasible cases. The result may be feasible or infeasible.

Model Characteristics

The variable count is num_ingredients. Constraint count is driven by one batch equality, active nutrient minimums, active nutrient maximums, finite ingredient availabilities, and ratio constraints. Nutrient and ratio rows are generally dense over all ingredients, though the nutrient matrix itself can contain many zeros, especially for minor and trace nutrients. Availability constraints are sparse one-variable rows.

The model is a continuous LP. Ingredient amounts are divisible; no integer batch, package, or recipe-count restrictions are modeled.

Practical Notes

This generator is useful for testing diet-style LPs with sparse nutrient matrices, fixed total mass, and feasibility controls based on achievable average nutrient content. Ratio constraints are stored as strings rather than a typed enum, and the model interprets any string containing "min" as a lower-bound ratio. Infeasible construction is stronger than random bad data because it uses achievable average calculations or total capacity shortage, but the constructor itself does not solve the instance.

← All generators
Production & Planning

Energy

Generates energy generation mix LPs that dispatch multiple power sources over time to meet demand at minimum generation cost.

Objective: Min Continuous variables

Overview

This generator represents short-horizon power generation planning. It creates conventional and renewable generation sources, time-varying demand, source capacities, source costs, emissions rates, and a minimum renewable share. The optimization model decides generation by source and time period.

Generator Data and Sizing

target_variables is used to tune a requested n_sources * n_periods product. The constructor starts from scale-dependent source and period ranges, initializes n_sources = min_sources + 2 and n_periods = min_periods + 12, then iteratively adjusts those requested dimensions for up to 15 iterations.

Scale conditionSourcesPeriodsPeak demand
target_variables < 2503-812-4810-100
target_variables < 10005-1224-72100-1000
otherwise8-2048-2001000-10000

Potential source types are:

SourceRenewableAvailabilityCapacity factorCost factor
coalno0.950.901.0
gasno0.980.851.2
nuclearno0.920.950.8
solaryes0.990.250.3
windyes0.950.350.4
hydroyes0.900.500.6
biomassyes0.880.751.1

The built JuMP model indexes source variables over the selected sources vector, not over 1:n_sources. Because there are only seven hard-coded source types, the actual number of modeled sources is length(sources) and can be smaller than the stored n_sources metadata when the requested source count exceeds the available renewable and conventional type pools.

Random data generation:

  • Renewable share target is beta-distributed: small Beta(2,3), medium Beta(3,4), large Beta(4,5).
  • Demand variation is beta-distributed and decreases with scale.
  • Base generation cost is log-normal around 60, 45, or 35 depending on scale.
  • Renewable cost factor is gamma-distributed.
  • Capacity margin is clipped normal: small about 1.35, medium about 1.25, large about 1.15.
  • Source costs combine base cost, source cost factor, renewable cost factor if applicable, and source-specific random variation.
  • Capacity shares use source-specific gamma, beta, or log-normal samples, normalized to total required capacity.
  • Demand follows residential, commercial, or industrial 24-hour profiles depending on peak demand, with weather, economic, random, and seasonal multiplicative noise.
  • Emissions are 0.0 for renewable sources and nuclear, emission_limit for coal, and half that for gas.

The stored struct fields are:

  • n_sources
  • n_periods
  • sources
  • time_periods
  • generation_costs
  • capacities
  • demands
  • emission_limits
  • renewable_fraction

The constructor calls Random.seed!(seed), so generation is reproducible for a fixed seed but resets Julia's global RNG state.

LP Formulation

Sets and indices:

  • S: selected generation sources.
  • T = {1, ..., n_periods}: time periods.
  • R = {s in S: emission_limits[s] == 0}: zero-emission sources used as renewables in the model.

Decision variables:

0 <= x_{s,t} <= capacity_s

x_{s,t} is the generation from source s in period t.

Objective:

minimize sum_{s in S} sum_{t in T} cost_s x_{s,t}

Constraints:

Demand satisfaction:

sum_{s in S} x_{s,t} >= demand_t    for each t in T

Emissions:

sum_{s in S} emission_s x_{s,t}
    <= max_emission * sum_{s in S} x_{s,t}    for each t in T

where max_emission = maximum(values(emission_limits)).

Renewable fraction:

sum_{s in R} x_{s,t}
    >= renewable_fraction * sum_{s in S} x_{s,t}    for each t in T

Bounds:

All variables are continuous and bounded above by source capacity in every period.

Feasibility Controls

  • feasible: The generator checks total nameplate capacity against peak_demand * capacity_margin and scales all capacities up if needed. It does not solve the final LP, but it tries to ensure adequate capacity before returning.
  • infeasible: One of four scenarios is sampled:
    • Capacity crisis: all capacities are scaled so total capacity is only 60% to 80% of peak demand.
    • Renewable intermittency: renewable_fraction is raised to 0.7 to 0.9, then renewable capacity is reduced to about 40% to 60% of the required renewable capacity.
    • Emission impossibility: fossil emissions are reduced near zero and clean capacity may be capped around 50% to 70% of peak demand.
    • Demand surge: demands are multiplied so peak demand exceeds current total capacity by roughly 10% to 30%.
  • unknown: The generator samples feasible behavior with probability 0.7; otherwise it samples one of the infeasible scenarios.

Model Characteristics

  • Variables: length(sources) * n_periods in the built model. The stored

    n_sources field records the requested adjusted count, but variables are created only for the selected named source types.

  • Constraints: 3 * n_periods: one demand, one emissions, and one renewable-share constraint per period.
  • Bounds: one upper bound from capacity for each selected source-period pair.
  • Density: each period constraint touches all source variables for that period, so the matrix is block diagonal by time period.
  • Model class: continuous LP; there are no unit commitment, startup, ramping, storage, or integer investment decisions.

Practical Notes

These instances are useful for testing dispatch-style LPs with repeated time-period blocks, upper-bounded variables, and renewable-share constraints. The emissions constraint is unusual: because it uses the maximum emissions rate among sources as the right-hand coefficient, it is generally nonbinding for nonnegative generation. Infeasibility is therefore mostly driven by capacity, demand, and renewable-fraction interactions rather than by the emissions row itself.

← All generators
Production & Planning

Inventory

InventoryProblem generates a multi-period production and inventory planning model with optional backlogging, capacity limits, stochastic demand patterns, and controlled feasibility.

Objective: Min Continuous variables

Overview

This generator represents single-item inventory control over a planning horizon. A planner chooses production in each period, carries inventory when production exceeds demand, and may incur backlog costs if backorders are allowed. When backlogging is disabled, every prefix of demand must be coverable by initial inventory plus cumulative production capacity.

Generator Data and Sizing

The generator first selects a scale from target_variables:

  • <= 250: small
  • <= 1000: medium
  • larger: large

Backlogging is enabled with probability 0.2, 0.4, or 0.6 for small, medium, or large instances. The horizon is chosen from the variable-budget formula:

  • With backlogging: variables are x[1:T], I_plus[0:T], and I_minus[0:T], so 3T + 2.
  • Without backlogging: variables are x[1:T] and I[0:T], so 2T + 1.

The initial n_periods is rounded from that formula, bounded to [2, 5000], then adjusted for up to 10 iterations until the variable count is within 10 percent of the target or cannot improve.

Scale-specific distributions:

  • Small: production capacity Uniform(50, 500), demand base Uniform(10, 100), demand volatility 0.2-0.5, initial inventory 10-50 percent of average demand, production cost base 10-100, production cost spread 0.1-0.3, annual holding rate 0.05-0.25 divided by 12.
  • Medium: production capacity Uniform(200, 2000), demand base Uniform(50, 1000), demand volatility 0.15-0.4, initial inventory 5-40 percent of average demand, production cost base 5-200, production cost spread 0.05-0.25, annual holding rate 0.03-0.20 divided by 12.
  • Large: production capacity Uniform(1000, 50000), demand base Uniform(100, 10000), demand volatility 0.1-0.3, initial inventory 2-30 percent of average demand, production cost base 1-500, production cost spread 0.02-0.20, annual holding rate 0.01-0.15 divided by 12.

Demands are sampled from a normal distribution centered at the demand range midpoint with standard deviation one quarter of the range, then clamped. The generator may add annual, weekly, and quarterly sinusoidal seasonality; exponential trends in production or holding costs; and occasional demand disruptions from a Poisson count. Backlog costs equal production costs multiplied by Uniform(1.5, 5.0).

The struct stores:

  • n_periods
  • prod_capacity
  • initial_inventory
  • backlog_allowed
  • demands
  • production_costs
  • holding_costs
  • backlog_costs

The constructor calls Random.seed!(seed), resetting Julia's global RNG.

LP Formulation

Sets:

  • T = {1, ..., n_periods} periods

Decision variables without backlogging:

  • x_t >= 0: production in period t
  • I_t >= 0: ending inventory in period t, including I_0

Objective without backlogging:

\[\min \sum_{t \in T} p_t x_t + h_t I_t\]

Constraints without backlogging:

\[I_0 = I^{init}\]
\[I_{t-1} + x_t - d_t = I_t \quad \forall t \in T\]
\[x_t \le C \quad \forall t \in T\]

Decision variables with backlogging:

  • x_t >= 0: production in period t
  • I^+_t >= 0: positive ending inventory
  • I^-_t >= 0: backlog

Objective with backlogging:

\[\min \sum_{t \in T} p_t x_t + h_t I^+_t + b_t I^-_t\]

Constraints with backlogging:

\[I^+_0 = I^{init}\]
\[I^-_0 = 0\]
\[I^+_{t-1} - I^-_{t-1} + x_t - d_t = I^+_t - I^-_t \quad \forall t \in T\]
\[x_t \le C \quad \forall t \in T\]

The implemented model is continuous. Production, inventory, and backlog can be fractional.

Feasibility Controls

The constructor maps feasibility_status to :feasible, :infeasible, or :all. For unknown, it leaves :all and applies no feasibility enforcement branch, so the initially generated stochastic instance is returned after base generation.

For feasible instances, if backlogging is disabled, the generator may take realistic operator actions: increase production capacity by 10-25 percent, raise initial inventory with safety stock, enable backlogging, or smooth peak demands while slightly increasing capacity. It then performs a surgical feasibility pass. If no-backlog cumulative demand has a prefix shortfall, it computes the minimum capacity needed per prefix and either raises capacity, adds enough initial inventory, or enables backlogging.

For infeasible instances, the generator forces backlog_allowed = false, then creates one of several disruptions: sustained high demand, reduced capacity and lower starting inventory, supplier disruption effects, or very low starting stock with high variability. Afterward it recomputes cumulative demand and guarantees prefix infeasibility. If the instance is still feasible, it lowers production capacity below the minimum prefix capacity needed.

The key feasibility test is the maximum prefix shortfall:

\[\max_t \left(\sum_{\tau=1}^t d_\tau - I^{init} - tC\right)\]

For no-backlog models, a positive value implies infeasibility. With backlogging allowed, demand balance can always carry shortage forward subject only to production capacity and nonnegative backlog variables.

Model Characteristics

Without backlogging, variables are 2T + 1; constraints are one initial-inventory equality plus T flow equalities and T capacity inequalities. With backlogging, variables are 3T + 2; constraints are two initial equalities plus T flow equalities and T capacity inequalities. The constraint matrix is very sparse and banded over adjacent periods.

The model is a continuous LP. It does not include setup decisions, lot sizes, fixed ordering costs, or integer production quantities.

Practical Notes

This generator is useful for testing long, sparse, time-linked LPs and prefix-feasibility behavior. unknown is unusual here compared with several other generators: it does not randomly choose feasible or infeasible; it returns the base stochastic data without explicit feasibility repair or sabotage. Feasible requests may switch a no-backlog instance into a backlog-allowed instance if that is the selected or necessary repair.

← All generators
Production & Planning

Product Mix

Product mix generates a continuous profit-maximization LP that chooses production levels across many products with resource capacities and optional market lower and upper bounds.

Objective: Max Continuous variables

Overview

This generator represents a product mix planning problem in which a manufacturer chooses quantities for multiple products. Products earn profit and consume resources; optional lower bounds model minimum market commitments, while optional upper bounds model market saturation or sales limits.

The generator includes scale-dependent and industry-dependent data generation. It can produce small, medium, and large operations with different resource counts, profit ranges, usage ranges, sparsity levels, and market-bound frequencies.

Generator Data and Sizing

target_variables maps directly to products:

num_products = max(2, min(10000, target_variables))

Resource count and parameter distributions depend on the requested scale.

For target_variables <= 250:

  • num_resources: DiscreteUniform(3, 8).
  • sparsity: Beta(2, 6).
  • profit_min: LogNormal(log(15), 0.4).
  • profit_max: LogNormal(log(120), 0.3).
  • resource_usage_min: LogNormal(log(1.0), 0.3).
  • resource_usage_max: LogNormal(log(5), 0.3).
  • market_constraint_prob: Beta(4, 6).
  • correlation_strength: Beta(4, 3).

For 250 < target_variables <= 1000:

  • num_resources: selected from 5:15 using a Beta(2, 3) sample.
  • sparsity: Beta(3, 4).
  • profit_min: LogNormal(log(8), 0.5).
  • profit_max: LogNormal(log(75), 0.4).
  • resource_usage_min: LogNormal(log(0.6), 0.4).
  • resource_usage_max: LogNormal(log(4.5), 0.4).
  • market_constraint_prob: Beta(5, 5).
  • correlation_strength: Beta(6, 4).

For target_variables > 1000:

  • num_resources: round(Int, LogNormal(log(18), 0.4)), clamped to 8:30.
  • sparsity: Beta(2, 3).
  • profit_min: LogNormal(log(3), 0.6).
  • profit_max: LogNormal(log(45), 0.5).
  • resource_usage_min: LogNormal(log(0.3), 0.5).
  • resource_usage_max: LogNormal(log(4), 0.5).
  • market_constraint_prob: Beta(6, 4).
  • correlation_strength: Beta(8, 3).

The generator samples an industry type from:

manufacturing, food_processing, electronics, furniture, chemical, automotive

with scale-dependent weights. Industry type then modifies profit ranges, resource usage ranges, sparsity, market-bound probability, and/or correlation strength. For example, electronics increases profit and usage maxima and raises sparsity, while automotive strongly increases profit and usage ranges but lowers sparsity and market-bound probability.

Generated data:

  • quality_factors[j]: Beta(2, 2) per product.
  • profits[j]: log-normal base profit clamped to [profit_min, profit_max], plus a quality-correlated component.
  • usage_matrix[i, j]: resource i usage by product j. Entries are zero with probability sparsity; otherwise they combine a resource-level base usage, a gamma random component, and quality correlation.
  • Each product is forced to use at least one resource.
  • Each resource is forced to be used by at least one product.
  • availabilities[i]: based on average usage per resource times num_products / 2, multiplied by log-normal variability clamped to [0.5, 2.0] and a factor in roughly [0.6, 1.2].
  • lower_bounds[j]: optional market floor, initially zero.
  • upper_bounds[j]: optional market cap, initially Inf.

Market bounds are generated from each product's single-product max_possible[j], computed as the minimum availability-to-usage ratio across resources with positive usage. Each product independently receives a lower bound with probability market_constraint_prob / 2 and an upper bound with the same probability.

The stored struct fields are:

  • num_products::Int
  • num_resources::Int
  • profits::Vector{Float64}
  • usage_matrix::Matrix{Float64}
  • availabilities::Vector{Float64}
  • lower_bounds::Vector{Float64}
  • upper_bounds::Vector{Float64}

The constructor calls Random.seed!(seed), so generation is reproducible for the same arguments and package version, while also resetting Julia's global RNG.

LP Formulation

Sets and indices:

  • Products j in P = {1, ..., num_products}.
  • Resources i in R = {1, ..., num_resources}.

Decision variables:

x_j >= 0    quantity of product j to produce

Objective:

\[\max \sum_{j \in P} profit_j x_j\]

Resource constraints:

\[\sum_{j \in P} usage_{i,j} x_j \le availability_i \quad \forall i \in R\]

Market lower-bound constraints are added only when lower_bounds[j] > 0:

\[x_j \ge lower\_bound_j\]

Market upper-bound constraints are added only when upper_bounds[j] < Inf:

\[x_j \le upper\_bound_j\]

Bounds:

  • All variables are continuous and nonnegative.
  • Upper bounds are optional and product-specific.

Interpretation: the model chooses a profit-maximizing production portfolio while respecting limited resources and product-level market requirements.

Feasibility Controls

The constructor converts the requested status into:

:feasible      if feasibility_status == feasible
:infeasible    if feasibility_status == infeasible
:all           if feasibility_status == unknown

For feasible, the generator:

  1. Repairs any product where lower_bounds[j] > upper_bounds[j] by setting the lower bound to 0.98 * upper_bounds[j].
  2. Computes aggregate resource usage required by the lower bounds.
  3. If any resource requirement exceeds availability, scales all lower bounds down by minimum(availability / required) * 0.98.
  4. Rechecks lower bounds against finite upper bounds.

This makes the lower-bound point feasible with respect to resource capacities and finite product caps.

For infeasible, the generator:

  1. Ensures there are positive lower bounds. If none exist, it selects products using a heavily used resource and assigns lower bounds equal to (0.15 + 0.25 * rand()) * max_possible[j].
  2. Computes required resource usage from lower bounds.
  3. Chooses the resource with the largest required-to-available ratio, or a heavily used fallback resource if requirements are zero.
  4. Reduces that resource availability to required[critical_i] * (1 - shortage_margin), where shortage_margin is sampled from 0.10 to 0.35.
  5. With probability 0.3, also reduces another resource to slightly below its required lower-bound usage.

For unknown, no repair or forced infeasibility branch is applied after the initial stochastic construction. The instance may be feasible or infeasible depending on the sampled bounds and capacities.

Model Characteristics

Variable count:

num_products

Constraint count drivers:

  • num_resources capacity constraints.
  • One lower-bound constraint for each positive lower_bounds[j].
  • One upper-bound constraint for each finite upper_bounds[j].

The usage matrix is sparse by construction: each entry is independently set to zero with probability sparsity, followed by repair passes that prevent all-zero product columns and all-zero resource rows.

The model is a continuous LP. Product quantities are divisible; no integer or batch constraints are enforced.

Practical Notes

This generator is useful for benchmarking product-mix LPs with more varied structure than the simpler production planning generator. It introduces sparse resource consumption, correlated profit/quality/usage patterns, industry-specific regimes, and both lower and upper market bounds.

For unknown, the code deliberately leaves the sampled instance uncorrected, so the status is not a randomized 70/30 choice as in some other generators. The imported LinearAlgebra module is not used by this file.

← All generators
Production & Planning

Production Planning

Production planning generates a continuous profit-maximization LP that chooses production quantities for products subject to shared resource capacities and optional minimum production requirements.

Objective: Max Continuous variables

Overview

This generator represents a classical production planning setting: a firm can manufacture several products, each product earns a per-unit profit, and each unit consumes limited resources such as labor, machine time, materials, or capacity. The optimization chooses how much of each product to make in order to maximize profit without exceeding available resources.

The infeasible variant adds minimum production commitments for selected products and then reduces a critical resource capacity so that those minimum commitments cannot all be met.

Generator Data and Sizing

target_variables maps directly to products:

n_products = max(2, min(2000, target_variables))

The number of resources is sampled independently:

n_resources = rand(1:50)

Generated random data:

  • profits[i]: integer profit for product i, sampled uniformly from 10:500.
  • usage[i, j]: resource j consumed per unit of product i, sampled uniformly from the float range 0.1:50.0.
  • resource_factor: sampled from 0.4:0.1:0.8.
  • resources[j]: set to sum_i usage[i, j] * resource_factor, so each resource capacity is a fixed fraction of the total consumption that would occur if every product were produced at one unit.
  • min_production[i]: initialized to zero and only made positive by the infeasible branch.

The stored struct fields are:

  • n_products::Int
  • n_resources::Int
  • profits::Vector{Int}
  • usage::Matrix{Float64}
  • resources::Vector{Float64}
  • min_production::Vector{Float64}

The constructor calls Random.seed!(seed), so the generator resets Julia's global RNG and is reproducible for the same arguments and package version.

LP Formulation

Sets and indices:

  • Products i in P = {1, ..., n_products}.
  • Resources j in R = {1, ..., n_resources}.

Decision variables:

x_i >= 0    quantity of product i to produce

Objective:

\[\max \sum_{i \in P} profit_i x_i\]

Resource constraints:

\[\sum_{i \in P} usage_{i,j} x_i \le resource_j \quad \forall j \in R\]

Minimum production constraints are added only for products with positive min_production[i]:

\[x_i \ge min\_production_i\]

Bounds:

  • All variables are continuous and nonnegative.
  • There are no explicit upper bounds on products except those implied by resource capacities.

Interpretation: the model chooses a product mix that consumes no more than each available resource and maximizes total profit. Positive minimum production values model contractual, policy, or demand-floor commitments.

Feasibility Controls

The constructor first sets actual_status = feasibility_status. If feasibility_status == unknown, it randomly chooses feasible with probability 0.7 and infeasible with probability 0.3.

For feasible, the generator leaves min_production at all zeros. Because x = 0 satisfies all resource constraints, the generated LP is feasible.

For infeasible, the generator:

  1. Computes a per-product single-product maximum:
       max_possible[i] = minimum(resources[j] / usage[i, j] for j in 1:n_resources)
  2. Selects n_constrained products, where:
       n_constrained = max(2, rand(max(1, div(n_products, 4)):max(2, div(n_products, 2))))
  3. Sets each selected minimum production to max_possible[i] * (0.3 + 0.3 * rand()).
  4. Computes the resource usage required by all minimum productions.
  5. Finds the resource with the largest required-to-available ratio.
  6. Reduces that critical resource to required[critical_j] * (0.7 + 0.2 * rand()).

This final reduction makes the lower-bound requirements exceed the capacity of at least the critical resource.

For unknown, the selected actual status follows the same feasible or infeasible path described above.

Model Characteristics

Variable count:

n_products

Constraint count drivers:

  • n_resources capacity constraints.
  • One extra lower-bound constraint for each product with min_production[i] > 0.

The resource matrix is dense. Every sampled usage[i, j] is positive because the range starts at 0.1, so each resource constraint includes every product.

The model is a continuous LP. Product quantities are not integer-restricted, so this is a production-rate or divisible-production relaxation rather than a batch/integer production model.

Practical Notes

This generator is useful for dense continuous LP benchmarks with direct control over the number of variables. The feasible instances are structurally simple because zero production is always feasible and all profits are positive, so the optimum is driven by resource bottlenecks rather than demand satisfaction. The infeasible instances are created by lower-bound commitments and a targeted resource shortage.

The number of resources is independent of target_variables, so small product sets can still receive many capacity constraints and large product sets can receive only a few.

← All generators
Production & Planning

Resource Allocation

Resource allocation generates a continuous profit-maximization LP that allocates limited resources among competing activities with optional minimum activity levels.

Objective: Max Continuous variables

Overview

This generator models a generic resource allocation problem: activities earn profit but consume multiple limited resources. The decision is how much of each activity to run. Examples include allocating labor, budget, machine time, or raw materials across departments, campaigns, jobs, or production modes.

The generator can construct feasible instances by building capacities around a synthetic demand plan, or construct infeasible instances by forcing minimum activity levels to exceed selected resource capacities.

Generator Data and Sizing

target_variables maps directly to activities:

n_activities = max(3, min(2000, target_variables))

Resource count and parameter ranges are sampled as:

  • n_resources: rand(2:50).
  • min_resource: integer from 50:1000.
  • max_resource: integer from 200:10000.
  • min_profit: from 0.1:0.1:2.0.
  • max_profit: from 5.0:5.0:100.0.
  • min_usage: from 0.01:0.01:0.5.
  • max_usage: from 1.0:1.0:20.0.
  • correlation_strength: from 0.5:0.1:0.9.
  • add_min_constraints: true with probability 0.7, always true for requested infeasible instances.
  • min_level_prob: from 0.2:0.1:0.5.

Generated data:

  • quality_factors[i]: uniform random quality in [0, 1).
  • profits[i]: base uniform profit plus a quality-correlated profit component.
  • usage[i, j]: dense resource usage; base uniform usage plus a quality-correlated usage component.
  • resources[j]: generated either stochastically for unknown or constructively from a demand plan for requested feasible/infeasible cases.
  • min_levels[i]: optional activity minimums, initialized to zero.

Two helper calculations are used in constructive modes:

  • compute_demand_plan(...): builds a normalized positive activity plan based on profit, average usage, and quality.
  • compute_single_activity_caps(...): computes each activity's maximum single-activity level under the current resource capacities.

The stored struct fields are:

  • n_activities::Int
  • n_resources::Int
  • profits::Vector{Float64}
  • usage::Matrix{Float64}
  • resources::Vector{Float64}
  • min_levels::Vector{Float64}

The constructor calls Random.seed!(seed), making instances reproducible for identical inputs while resetting Julia's global RNG.

LP Formulation

Sets and indices:

  • Activities i in A = {1, ..., n_activities}.
  • Resources j in R = {1, ..., n_resources}.

Decision variables:

x_i >= 0    level of activity i

Objective:

\[\max \sum_{i \in A} profit_i x_i\]

Resource constraints:

\[\sum_{i \in A} usage_{i,j} x_i \le resource_j \quad \forall j \in R\]

Minimum-level constraints are added only when min_levels[i] > 0:

\[x_i \ge min\_level_i\]

Bounds:

  • All variables are continuous and nonnegative.
  • There are no explicit upper bounds except those implied by resource capacities.

Interpretation: the model allocates scarce resources to the most profitable activity levels while satisfying any required minimum levels.

Feasibility Controls

The constructor maps the requested status to:

:feasible      if feasibility_status == feasible
:infeasible    if feasibility_status == infeasible
:all           if feasibility_status == unknown

For unknown (:all), the generator uses the original stochastic construction:

  1. Computes expected usage per resource.
  2. Sets resources to a random fraction of expected aggregate usage, then clamps each resource to [min_resource, max_resource].
  3. If minimum constraints are enabled, assigns each activity a positive minimum with probability min_level_prob; each minimum is 0.1, 0.15, 0.2, 0.25, or 0.3 times that activity's single-activity maximum.

This branch does not force a particular feasibility outcome.

For feasible, the generator:

  1. Builds a demand plan from profits, usage, and quality.
  2. Sets each resource to demand-plan consumption times a capacity anchor from 1.10 to 1.50, then clamps to [min_resource, max_resource].
  3. Scales the demand plan to fit the post-clamp capacities.
  4. Creates a baseline activity vector at roughly 75% to 95% of the fitted scale.
  5. Optionally assigns minimum levels as 10% to 30% of positive baseline levels.

The positive minimum levels are therefore constructed below a known baseline that fits the resource capacities.

For infeasible, the generator starts similarly but with tighter capacity anchors from 0.95 to 1.10 and a baseline at roughly 70% to 90% of the fitted scale. It then:

  1. Ensures minimum constraints are enabled and at least one or two minimum levels are selected when possible.
  2. Clips existing positive minimum levels to at most 0.8 times each single-activity cap.
  3. Selects one to three resources to violate, biased toward the highest current minimum-consumption ratios.
  4. Sets target required consumption for those resources to resources[r] * (1.05 to 1.25).
  5. Raises selected minimum levels iteratively until the target violation is reached or candidates are exhausted.
  6. Performs final checks and may reduce a violated resource capacity to below the final required minimum consumption.

The intended infeasibility is that the sum of resource usage implied by mandatory minimum activity levels exceeds one or more resource capacities.

Model Characteristics

Variable count:

n_activities

Constraint count drivers:

  • n_resources dense resource constraints.
  • One lower-bound constraint for each positive min_levels[i].

The usage matrix is dense because every entry is assigned a positive continuous value. Constraint density is therefore high: every resource row involves every activity.

The model is a continuous LP. Activities are divisible, with no integer or binary restrictions.

Practical Notes

This generator is useful for dense resource-allocation LPs where profit and resource usage are positively correlated through a quality factor. Requested feasible and infeasible modes are more constructive than the unknown mode, which remains stochastic.

Because both min_resource and max_resource are sampled independently, it is possible for min_resource to exceed max_resource. The code then applies max.(resources, min_resource) followed by min.(resources, max_resource) or equivalent scalar clamps, so the final cap can be driven to max_resource in those cases. This is an implementation quirk to keep in mind when interpreting sampled resource scales.

← All generators
Assignment & Scheduling

Airline Crew

Generates airline crew pairing set-partitioning instances where each flight must be covered by exactly one crew pairing at minimum cost.

Objective: Min Binary variables

Overview

This generator represents the crew pairing step in airline operations planning. A flight network is generated over base and non-base airports, then feasible-looking pairings are built as sequences of connected flights. The optimization model chooses pairings so that every flight in the planning horizon is covered exactly once.

Generator Data and Sizing

target_variables is interpreted as the target number of pairing variables. The constructor first samples a scale:

Scale conditionAirportsBasesFlightsMax pairingsMax flights per pairing
target_variables <= 3005-122-440-8050-1503-5
target_variables <= 150010-253-680-200150-5004-6
otherwise20-605-15150-600400-20005-8

The sampled flight and pairing counts are iteratively adjusted for up to 15 iterations. The sizing target used during adjustment is:

current_vars = min(max_pairings, num_flights * 2)

The final number of pairings is min(max_pairings, actual_num_flights * 2), so the realized variable count can differ from target_variables.

Generated data:

  • Airports 1:num_bases are crew bases; the rest are non-base airports.
  • Flights are first generated in hub-and-spoke patterns from bases to non-bases with probability 0.7, and from non-bases to bases with probability 0.6.
  • Any remaining flights are filled with random point-to-point origin-destination pairs with distinct endpoints.
  • Flight costs are sampled from truncated normals: small Normal(1200:2000, 300:600) truncated to [500, 10000], medium Normal(2000:3500, 600:1200), large Normal(3000:6000, 1000:2500).
  • Base-to-non-base and point-to-point costs receive additional multiplicative beta noise.
  • Pairing overhead means are sampled from truncated normals around 0.2, 0.25, or 0.35 by scale; overhead standard deviations are similarly sampled and truncated.
  • Pairing length is sampled with weights [0.4, 0.3, 0.15, 0.1, 0.04, 0.01], truncated to the maximum allowed length.

The stored struct fields are:

  • num_flights
  • flight_origins
  • flight_destinations
  • pairing_costs
  • flights_in_pairing

The constructor calls Random.seed!(seed), so generation is reproducible for a fixed seed but resets Julia's global RNG state.

LP Formulation

Sets and indices:

  • F = {1, ..., num_flights}: flights.
  • P = {1, ..., length(pairing_costs)}: generated pairings.
  • A_p subset F: flights contained in pairing p.

Decision variables:

x_p in {0, 1}

x_p = 1 means pairing p is selected.

Objective:

minimize sum_{p in P} c_p x_p

where c_p is pairing_costs[p].

Constraints:

sum_{p in P: f in A_p} x_p = 1    for each f in F

Each flight must be covered exactly once. There are no explicit connection, duty-time, crew-base, or rest constraints in the JuMP model; those are represented only indirectly through how pairings are generated.

Bounds:

x_p binary

At the package API level, generate_problem(...; relax_integer=true) is the default, so these binary variables are relaxed unless the caller sets relax_integer=false.

Feasibility Controls

  • feasible: The generator builds an exact cover first. It repeatedly assigns unassigned flights to pairings, removes covered flights from the unassigned set, then adds extra random pairings until the target count is reached. The initial partition provides a feasible integer solution before relaxation.
  • infeasible: The generator chooses one random flight and puts it in an avoid_set. All generated pairings avoid that flight. The model still includes the equality constraint for the avoided flight, so its covering sum is empty and the constraint is 0 == 1.
  • unknown: The generator randomly selects feasible behavior with probability 0.5; otherwise it uses the infeasible behavior above.

Model Characteristics

  • Variables: length(pairing_costs), approximately min(max_pairings, 2 * num_flights).
  • Constraints: exactly num_flights covering equalities.
  • Nonzeros: driven by total flight appearances across all pairings. Pairing lengths are short, so the matrix is sparse.
  • Intended model class: binary set partitioning.
  • Default generated LP: with the package default relax_integer=true, the binary pairing choices are converted to continuous variables, yielding the LP relaxation.

Practical Notes

These instances are useful for testing set-partitioning structure, sparse exact-cover constraints, and binary-to-LP relaxation behavior. Costs encode some operational realism, including base start/end discounts and length penalties, but the generated pairings are not validated against real airline duty rules. In infeasible mode the infeasibility is structural and direct: at least one flight has no covering pairing.

← All generators
Assignment & Scheduling

Assignment

The assignment generator creates worker-task matching models with binary assignment variables, compatibility restrictions, and randomized cost structure.

Objective: Min Binary variables

Overview

This generator represents assigning workers to tasks at minimum cost. Each task must receive exactly one worker, each worker can perform at most one task, and some worker-task pairs may be forbidden by a compatibility matrix. Costs can reflect specialization and worker/task group affinity.

Generator Data and Sizing

target_variables maps to the worker-task matrix size:

target_variables ~= n_workers * n_tasks

The constructor seeds Julia's global RNG with Random.seed!(seed), so all dimension choices, compatibility choices, and costs are reproducible for the same inputs.

Size-dependent parameters:

Target variablesBalanced probabilityBase cost rangeSpecialization probabilityCost variation weights
<= 2500.85 to 300.2low 0.6, medium 0.3, high 0.1
<= 10000.610 to 1000.4low 0.3, medium 0.5, high 0.2
> 10000.450 to 5000.6low 0.2, medium 0.3, high 0.5

Balanced instances use a square matrix with n_workers = n_tasks near sqrt(target_variables), both at least 5. Unbalanced instances use a random ratio in [0.5, 2.0) and then make up to three adjustment passes to move the product closer to the target.

After feasibility adjustments, the generator assigns workers and tasks to random skill groups. The number of groups is sampled from 2:gmax, where gmax = min(6, max(2, round(sqrt(min(n_workers, n_tasks))))). Compatibility density is based on final matrix size:

  • 0.85 for <= 250 possible assignments
  • 0.70 for <= 1000
  • 0.50 for larger matrices

Within-group compatibility probability is min(0.98, base_density). Cross-group compatibility probability is max(0.02, 0.3 * base_density).

The allowed matrix is only randomized when feasibility_status is feasible or infeasible. For unknown, it remains all true.

Costs are integer values. The upper end of the base cost range is multiplied by a random factor in [0.8, 1.2), with a minimum spread of 5 above the low cost. Specialized workers have a few low-cost tasks and higher costs elsewhere. Non-specialized costs use low, medium, or high variation, with lower expected costs for matching worker/task groups.

The struct stores:

  • n_workers::Int
  • n_tasks::Int
  • costs::Matrix{Int}
  • allowed::Matrix{Bool}

LP Formulation

Sets:

  • W = {1, ..., n_workers} for workers
  • T = {1, ..., n_tasks} for tasks

Decision variable:

x[i,j] in {0,1} = 1 if worker i is assigned to task j

Objective:

minimize sum_{i in W, j in T} costs[i,j] * x[i,j]

Worker capacity:

sum_{j in T} x[i,j] <= 1    for each worker i

Task coverage:

sum_{i in W} x[i,j] = 1     for each task j

Compatibility restrictions:

x[i,j] = 0                  for each forbidden pair where allowed[i,j] == false

Bounds are binary bounds from the JuMP Bin declaration.

Feasibility Controls

The constructor maps statuses to internal symbols: feasible becomes :feasible, infeasible becomes :infeasible, and unknown becomes :all.

  • feasible: if there are fewer workers than tasks, it adds enough workers to cover all tasks and may add 1 to 3 extra workers with probability 0.3. If randomized compatibility is active, it then constructs a task order and ensures each task has at least one allowed, previously unused worker, editing allowed when necessary. This creates a matching witness at the compatibility level.
  • infeasible: forces an unbalanced shortage when needed by setting n_tasks above n_workers by a gap of about 5% to 25% of workers. Since every task must be assigned and every worker can take at most one task, this capacity shortfall is infeasible. With probability 0.4, it instead or additionally creates a Hall violation by selecting a subset of tasks and allowing them to be served only by a smaller subset of workers.
  • unknown: does not adjust dimensions for feasibility and does not randomize compatibility; allowed remains all true.

For infeasible, the worker/task count shortfall is a direct infeasibility certificate when applied. Hall violations are also designed to be infeasible because more tasks are restricted to fewer workers.

Model Characteristics

The model has n_workers * n_tasks binary variables. It has n_workers worker-capacity constraints, n_tasks task-coverage constraints, and one equality-fixing constraint for each forbidden assignment.

This is a mixed-integer model as built by build_model, not a pure continuous LP. If the package-level generation path is called with integer relaxation enabled, the binary variables may be relaxed outside this file; the generator itself declares them as Bin.

The dense variable matrix is structurally sparse in constraints: each assignment variable appears in one worker row and one task row, plus one compatibility-fixing row if forbidden.

Practical Notes

These instances are useful for testing assignment structure, binary relaxation behavior, and compatibility-driven infeasibility. The unknown mode is unusual because it keeps all assignments allowed, so infeasibility in that mode is mainly from a natural worker/task count imbalance rather than compatibility. The generator stores costs for forbidden pairs too, but the model fixes those variables to zero.

← All generators
Assignment & Scheduling

Cutting Stock

CuttingStockProblem generates a cutting-pattern model that minimizes the number of stock pieces used to satisfy demand for required piece lengths.

Objective: Min Integer variables

Overview

This generator represents a one-dimensional cutting stock planning problem. A manufacturer has stock material of a standard length and must cut it into demanded piece lengths. The model chooses how many times to use each generated cutting pattern.

Generator Data and Sizing

target_variables is used as a soft target for the number of generated cutting patterns:

max_patterns = target_variables

The actual number of variables in the built model is length(patterns). It can be less than target_variables if duplicate complex patterns are rejected or the pattern generator exhausts attempts. It can also exceed target_variables when mandatory single-piece patterns are added for more piece types than the requested pattern target. Piece-type counts and distributions scale by target size:

  • target_variables <= 250: n_piece_types from 3:min(15, max(3, target_variables / 10)), stock length from Uniform(3, 8), demand range from random 5:20 to random 50:200, common-length probability 0.3-0.6, waste factor 0.05-0.15.
  • target_variables <= 1000: n_piece_types from 8:min(50, max(8, target_variables / 20)), stock length from Uniform(6, 12), demand range from random 20:100 to random 200:1000, common-length probability 0.4-0.7, waste factor 0.03-0.10.
  • larger targets: n_piece_types from 20:min(200, max(20, target_variables / 50)), stock length from Uniform(8, 20), demand range from random 100:500 to random 1000:10000, common-length probability 0.5-0.8, waste factor 0.02-0.08.

Piece lengths are either near common sizes, with Normal(0, 0.02) variation, or sampled from a transformed Beta(2, 3) distribution between 0.1 and about 95 percent of stock length. Lengths are rounded to precision 0.05 for smaller stock and 0.1 for stock over 10. Duplicate lengths are removed, so actual piece-type count may shrink.

Demands are sampled from lognormal distributions. Common lengths use LogNormal(log((demand_min + demand_max) / 1.3), 0.5); other lengths use LogNormal(log((demand_min + demand_max) / 2.0), 0.7). Demands are rounded to coarser increments as they grow and clamped to the selected range.

The struct stores:

  • piece_lengths
  • demands
  • patterns, where patterns[p][i] is the count of piece type i produced by pattern p
  • stock_length
  • stock_limit, where 0 means unlimited stock pieces

The constructor calls Random.seed!(seed), resetting Julia's global RNG.

LP Formulation

Sets:

  • P = {1, ..., number of patterns} cutting patterns
  • I = {1, ..., number of piece types} required piece lengths

Decision variable:

  • x_p >= 0: number of times pattern p is used

Objective:

\[\min \sum_{p \in P} x_p\]

Demand satisfaction:

\[\sum_{p \in P} a_{pi} x_p \ge d_i \quad \forall i \in I\]

Optional stock limit:

\[\sum_{p \in P} x_p \le S \quad \text{if } S > 0\]

Bounds are nonnegativity only. Although cutting stock is naturally integer, x_p is continuous in the implemented model, so this is the LP relaxation of the pattern-count problem.

Feasibility Controls

The constructor maps the requested status to a boolean target:

  • feasible: target feasible.
  • infeasible: target infeasible.
  • unknown: randomly chooses feasible or infeasible with probability 0.5 each.

For feasible targets, the generator creates patterns using generate_cutting_patterns, copies base demands, perturbs each demand by Uniform(0.8, 1.2), and sets stock_limit = 0 for unlimited stock. Pattern generation always starts with single-piece patterns for each piece type, so every piece type has at least one direct production route as long as each length fits in stock.

For infeasible targets, one of three methods is selected:

  • Single piece impossibility: picks the smallest piece type, estimates its best pattern efficiency, sets a finite stock limit, and raises that piece's demand above maximum possible production under the stock limit.
  • No-pattern scenario: removes every pattern containing one target piece type while keeping positive demand for it, with unlimited stock. That piece cannot be produced at all.
  • Combined stock and demand contradiction: generates normal patterns, estimates a lower bound on stock needed, sets stock limit to 40-70 percent of that bound, and scales demands upward.

Infeasible demand scaling is influenced by scenario labels such as rush order, seasonal spike, backlog clearing, and mixed, with scaling factors generally between about 0.8 and 4.0 depending on length commonality and scenario.

Model Characteristics

Variable count is length(patterns), not necessarily exactly target_variables. Constraint count is one row per piece type plus one stock-limit row when stock_limit > 0. The pattern matrix is sparse because each pattern uses only a subset of piece types. Single-piece patterns are very sparse; generated complex patterns have a small random number of selected piece types, biased toward shorter pieces.

The model is continuous even though the real cutting decision is integer. Fractional pattern usage is allowed by the built JuMP model.

Practical Notes

This generator is useful for testing column-style covering LPs, sparse nonnegative matrices, and infeasibility from missing coverage or aggregate resource limits. It does not generate patterns by solving a pricing problem; it samples a fixed pattern list up front. Because duplicate pattern rejection and unique piece lengths can shrink dimensions, expect actual model size to differ from the target, especially for small targets or repeated common lengths.

← All generators
Assignment & Scheduling

Scheduling

Scheduling generates a binary workforce scheduling model that assigns workers to shifts while minimizing cost and satisfying staffing, availability, workload, and consecutive-day constraints.

Objective: Min Binary variables

Overview

This generator represents workforce scheduling for settings such as stores, restaurants, clinics, hospitals, call centers, airlines, and manufacturing operations. It assigns workers to shifts across a multi-day horizon, respecting worker availability, at most one shift per day, minimum and maximum total shifts, and a maximum consecutive working-days rule.

The generator can optionally create skill-based availability: workers have skills, shifts require skills, and availability is removed when a worker lacks a required skill. The build model itself uses the resulting availability matrix; it does not add separate skill constraints.

Generator Data and Sizing

target_variables is intended to approximate n_workers * n_shifts * n_days, but the constructor samples dimensions by scale rather than solving exactly for the requested count.

For target_variables <= 250:

  • n_workers: rounded Uniform(4, 12).
  • n_shifts: rounded Uniform(2, 4).
  • n_days: rounded Uniform(3, 7).
  • min_staffing: rounded Uniform(1, 2).
  • max_staffing: min_staffing + rounded Uniform(1, 3).
  • availability_density: Beta(8, 2).
  • min_worker_shifts: rounded Uniform(2, 4).
  • max_worker_shifts: min_worker_shifts + rounded Uniform(1, 3).
  • max_consecutive_shifts: rounded Uniform(2, 3).
  • min_cost: Uniform(15, 25).
  • max_cost: min_cost + Uniform(20, 40).
  • skill_based: true with probability 0.2.

For 250 < target_variables <= 1000:

  • n_workers: rounded Uniform(8, 25).
  • n_shifts: rounded Uniform(3, 6).
  • n_days: rounded Uniform(5, 14).
  • min_staffing: rounded Uniform(2, 4).
  • max_staffing: min_staffing + rounded Uniform(2, 5).
  • availability_density: Beta(6, 3).
  • min_worker_shifts: rounded Uniform(3, 5).
  • max_worker_shifts: min_worker_shifts + rounded Uniform(2, 4).
  • max_consecutive_shifts: rounded Uniform(2, 4).
  • min_cost: Uniform(20, 35).
  • max_cost: min_cost + Uniform(25, 60).
  • skill_based: true with probability 0.4.

For target_variables > 1000:

  • n_workers: rounded Uniform(25, 80).
  • n_shifts: rounded Uniform(4, 8).
  • n_days: rounded Uniform(7, 30).
  • min_staffing: rounded Uniform(3, 6).
  • max_staffing: min_staffing + rounded Uniform(3, 8).
  • availability_density: Beta(4, 3).
  • min_worker_shifts: rounded Uniform(4, 6).
  • max_worker_shifts: min_worker_shifts + rounded Uniform(2, 5).
  • max_consecutive_shifts: rounded Uniform(3, 5).
  • min_cost: Uniform(25, 50).
  • max_cost: min_cost + Uniform(30, 100).
  • skill_based: true with probability 0.6.

If skill-based scheduling is enabled:

  • n_skills is sampled as 2:3, 3:5, or 4:8 depending on scale.
  • Each worker receives between one and min(3, n_skills) skills.
  • Each shift requires one skill.
  • Worker-shift availability is zeroed out when the worker does not have any required skill for that shift.

Generated data:

  • total_shifts = n_shifts * n_days.
  • staffing_req[s]: generated from a Poisson mean affected by shift-in-day peak factor and weekend factor, then clamped to [min_staffing, max_staffing].
  • availability[w, s]: binary availability generated from full-time or part-time worker patterns, shift timing, and weekend adjustments.
  • costs[w, s]: generated from worker tier (junior, regular, senior), optional skill premium, shift timing premiums, weekend/month-end premiums, and log-normal noise.

The stored struct fields are:

  • n_workers::Int
  • n_shifts::Int
  • n_days::Int
  • total_shifts::Int
  • staffing_req::Vector{Int}
  • availability::Matrix{Int}
  • costs::Matrix{Float64}
  • min_worker_shifts::Int
  • max_worker_shifts::Int
  • max_consecutive_shifts::Int
  • skill_based::Bool
  • worker_skills::Union{Matrix{Int}, Nothing}
  • shift_skill_req::Union{Matrix{Int}, Nothing}

The constructor calls Random.seed!(seed), making generated instances reproducible for the same inputs while resetting Julia's global RNG.

LP Formulation

Sets and indices:

  • Workers w in W = {1, ..., n_workers}.
  • Shifts s in S = {1, ..., total_shifts}.
  • Days d in D = {1, ..., n_days}.
  • S_d: shifts belonging to day d.

Decision variables:

x_{w,s} in {0, 1}    1 if worker w is assigned to shift s, 0 otherwise

Objective:

\[\min \sum_{w \in W} \sum_{s \in S} cost_{w,s} x_{w,s}\]

Staffing requirements:

\[\sum_{w \in W} x_{w,s} \ge staffing\_req_s \quad \forall s \in S\]

Availability constraints are added for unavailable worker-shift pairs:

\[x_{w,s} = 0 \quad \text{if availability}_{w,s} = 0\]

At most one shift per worker per day:

\[\sum_{s \in S_d} x_{w,s} \le 1 \quad \forall w \in W, d \in D\]

Minimum and maximum shifts per worker:

\[\sum_{s \in S} x_{w,s} \ge min\_worker\_shifts \quad \forall w \in W\]
\[\sum_{s \in S} x_{w,s} \le max\_worker\_shifts \quad \forall w \in W\]

Maximum consecutive working days:

If max_consecutive_shifts >= 1 and n_days > max_consecutive_shifts, the model creates windows of length max_consecutive_shifts + 1 and enforces:

\[\sum_{s \in window} x_{w,s} \le max\_consecutive\_shifts\]

for each worker and each rolling day window. Since there is also at most one shift per day, this prevents working more than max_consecutive_shifts days in any such window.

Interpretation: the model finds the minimum-cost assignment of available workers to required shifts while respecting worker workload rules.

Feasibility Controls

For feasible, the generator applies a constructive feasibility process:

  1. Converts shift availability into day availability per worker.
  2. Computes each worker's maximum assignable days under the consecutive-day rule.
  3. Reduces min_worker_shifts to at most the minimum worker capacity and keeps max_worker_shifts consistent.
  4. Caps per-shift staffing by available worker counts with beta-distributed slack.
  5. Caps per-day total demand by available day workers with reserve.
  6. Caps global demand by total worker capacity with reserve.
  7. Uses randomized greedy matching to assign workers to shifts while respecting availability, at most one shift per day, maximum shifts, and consecutive-day windows.
  8. Reduces unmet staffing requirements to the covered amounts.
  9. Tries to bring each worker up to min_worker_shifts, freeing demand slots when possible. If a worker cannot be brought up to the minimum, the global minimum is lowered to that worker's assigned count.

The final staffing_req is aligned to a constructed assignment, and worker minimums are adjusted if necessary.

For infeasible, the generator randomly chooses one of three modes:

  • shift_blackout: selects one or two high-demand shifts, ensures each has positive demand, and sets all worker availability for those shifts to zero.
  • day_overload: picks a day and raises total demand beyond the number of workers available on that day.
  • min_over_cap: computes each worker's capacity under availability, max_worker_shifts, and consecutive-day rules, then sets min_worker_shifts above the minimum worker capacity.

For unknown, no special feasibility or infeasibility branch is run. The initially sampled staffing, availability, costs, and worker limits are used as-is, so feasibility depends on the random draw.

Model Characteristics

Variable count:

n_workers * total_shifts

Constraint count drivers:

  • total_shifts staffing constraints.
  • One equality constraint for each unavailable worker-shift pair.
  • n_workers * n_days at-most-one-shift-per-day constraints.
  • 2 * n_workers minimum/maximum workload constraints.
  • n_workers * (n_days - max_consecutive_shifts) rolling-window constraints when the consecutive-day rule applies.

Availability constraints can dominate the constraint count when availability is sparse, especially after skill filtering.

This is a binary integer model, not a continuous LP. A continuous relaxation would use 0 <= x_{w,s} <= 1, but the implementation declares Bin.

Practical Notes

This generator is useful for workforce assignment and scheduling benchmarks with rich combinatorial structure. The requested feasible path is unusually involved and actively modifies staffing requirements and minimum shifts to match a constructed schedule.

The field name max_consecutive_shifts is implemented as a maximum number of consecutive working days, not consecutive shifts. Skill requirements are enforced indirectly by modifying availability; the model builder does not use worker_skills or shift_skill_req after construction.

← All generators
Selection & Finance

Knapsack

The knapsack generator creates fractional continuous knapsack LPs that maximize selected item value under a capacity limit and optional minimum-value requirement.

Objective: Max Continuous variables

Overview

This generator represents selecting portions of items to fit within a weight budget. Each item has a value and weight, and the decision variable can take any value from 0 to 1. Despite the common integer interpretation of knapsack, this implementation is explicitly fractional.

Generator Data and Sizing

target_variables maps directly to the item count:

target_variables = n_items

The constructor seeds Julia's global RNG with Random.seed!(seed), so item values, weights, capacity, and infeasibility threshold are reproducible for the same inputs.

Size-dependent value and weight ranges are sampled in two stages. First the low and high endpoints are drawn, then item data is sampled uniformly from the resulting integer intervals:

Target variablesValue range endpointsWeight range endpoints
<= 100low from 5:20, high from 80:150low from 3:8, high from 15:25
<= 1000low from 10:30, high from 100:300low from 5:15, high from 20:40
> 1000low from 20:50, high from 200:500low from 10:25, high from 30:60

Capacity is based on total item weight:

capacity_ratio = 0.3 + rand() * 0.4
capacity = round(Int, sum(weights) * capacity_ratio)
capacity = max(1, capacity + rand(-50:50))

The struct stores:

  • n_items::Int
  • capacity::Int
  • values::Vector{Int}
  • weights::Vector{Int}
  • min_value::Float64

The docstring lists the first four fields, but the struct also includes min_value, which is used to encode infeasible instances.

LP Formulation

Set:

  • I = {1, ..., n_items}

Decision variable:

0 <= x[i] <= 1 = fraction of item i selected

Objective:

maximize sum_{i in I} values[i] * x[i]

Capacity constraint:

sum_{i in I} weights[i] * x[i] <= capacity

Optional minimum-value constraint:

sum_{i in I} values[i] * x[i] >= min_value

The minimum-value constraint is only added when min_value > 0.

Feasibility Controls

The constructor first samples item data and capacity. It initializes min_value = 0.0.

  • feasible: leaves min_value at 0.0, so the all-zero selection is feasible.
  • infeasible: computes the fractional knapsack optimum by sorting items by value-to-weight ratio and greedily filling capacity. It then sets min_value to 110% to 140% of that computed maximum achievable value, making the capacity and minimum-value constraints inconsistent.
  • unknown: randomly maps to feasible with probability 0.7 or infeasible with probability 0.3, then follows the corresponding behavior.

The infeasibility control is exact for the fractional model because the greedy value-to-weight ordering solves the continuous knapsack relaxation.

Model Characteristics

The model has n_items continuous variables with lower bound 0 and upper bound 1. It always has one capacity constraint and may have one minimum-value constraint. The objective and constraints are dense across items.

This is not a binary knapsack model. It models fractional item selection, so it is an LP relaxation of the common 0-1 knapsack problem. The feasible mode is trivially feasible because choosing no items satisfies the capacity constraint and there is no positive value requirement.

Practical Notes

These instances are useful for testing bounded-variable LPs with dense rows and controlled infeasibility. They are not appropriate when the desired benchmark is combinatorial knapsack hardness, unless the package-level workflow separately imposes or preserves integrality. The implementation's total_avg_weight variable is actually the sum of sampled weights.

← All generators
Selection & Finance

Portfolio

Generates CVaR portfolio optimization LPs with allocation, exposure, position, and turnover constraints.

Objective: Max Continuous variables

Overview

This generator represents institutional portfolio construction. It chooses asset weights to maximize expected return while limiting Conditional Value-at-Risk (CVaR), sector and region concentration, asset-class allocations, factor exposures, position sizes, and turnover from a benchmark.

Generator Data and Sizing

target_variables is interpreted as the full LP variable count:

total variables = n_assets + n_scenarios + 1 + n_assets + n_assets
                = 3 * n_assets + n_scenarios + 1

The constructor sets:

n_assets = max(3, floor(target_variables / 5))
n_scenarios = max(n_assets, target_variables - 3 * n_assets - 1)

Scale-dependent group counts:

Scale conditionSectorsRegionsAsset classesFactors
target_variables <= 1003-63-43-43-5
target_variables <= 5005-103-53-54-7
otherwise8-124-64-55-8

Each group count is capped at n_assets. Sector, region, and asset-class assignments are balanced so each group receives at least one asset.

Random data generation:

  • Factor loadings are Normal(0, 0.3) with an added sector-correlated primary factor loading in Uniform(0.3, 0.7).
  • Scenario factor returns are Normal(0, 0.05) and idiosyncratic returns are Normal(0, 0.02).
  • Scenario returns are generated by a factor model plus idiosyncratic noise.
  • Expected returns are scenario means plus Normal(0, 0.01) noise.
  • CVaR confidence beta is Uniform(0.90, 0.99).
  • Initial cvar_limit is based on equal-weight tail losses and multiplied by Uniform(0.8, 1.5).
  • Sector and region upper limits scale group weights by Uniform(1.2, 2.5) and are capped at 1.
  • Asset-class lower bounds scale group weights by Uniform(0.1, 0.5); upper bounds scale by Uniform(1.2, 2.5) and are at least lower + 0.05.
  • Factor bounds are centered around equal-weight exposure with random margins of 0.1-0.3.
  • Position limits are sampled between max(2 / n_assets, 0.02) and min(0.3, 5 / n_assets), then scaled up if their sum is below 1.05.
  • Benchmark weights are normalized log-normal samples.
  • Turnover limit is Uniform(0.3, 1.5).

The stored struct fields are:

  • n_assets
  • n_scenarios
  • n_sectors
  • n_regions
  • n_asset_classes
  • n_factors
  • expected_returns
  • scenario_returns
  • beta
  • cvar_limit
  • sector_assignments
  • region_assignments
  • asset_class_assignments
  • sector_upper
  • region_upper
  • asset_class_lower
  • asset_class_upper
  • factor_loadings
  • factor_lower
  • factor_upper
  • max_position
  • benchmark
  • turnover_limit

The constructor calls Random.seed!(seed), so generation is reproducible for a fixed seed but resets Julia's global RNG state.

LP Formulation

Sets and indices:

  • I = {1, ..., n_assets}: assets.
  • Omega = {1, ..., n_scenarios}: return scenarios.
  • K: sectors.
  • G: regions.
  • C: asset classes.
  • F: risk factors.

Decision variables:

x_i >= 0
z_omega >= 0
alpha free
d_plus_i >= 0
d_minus_i >= 0

x_i is the portfolio weight. z_omega is the CVaR shortfall auxiliary for scenario omega. alpha is the VaR threshold. d_plus_i and d_minus_i linearize turnover from the benchmark.

Objective:

maximize sum_{i in I} expected_return_i x_i

Constraints:

CVaR shortfall:

z_omega >= -sum_{i in I} scenario_return_{omega,i} x_i - alpha
    for each omega in Omega

CVaR limit:

alpha + (1 / ((1 - beta) * |Omega|)) * sum_{omega in Omega} z_omega <= cvar_limit

Fully invested budget:

sum_{i in I} x_i = 1

Sector and region upper bounds:

sum_{i in sector k} x_i <= sector_upper_k    for each k
sum_{i in region g} x_i <= region_upper_g    for each g

Asset-class lower and upper bounds:

asset_class_lower_c <= sum_{i in class c} x_i <= asset_class_upper_c
    for each c

Factor exposure bounds:

factor_lower_f <= sum_{i in I} factor_loading_{i,f} x_i <= factor_upper_f
    for each f

Position limits:

x_i <= max_position_i    for each i

Turnover decomposition and limit:

d_plus_i - d_minus_i = x_i - benchmark_i    for each i
sum_{i in I} (d_plus_i + d_minus_i) <= turnover_limit

Bounds:

x, z, d_plus, and d_minus are continuous and nonnegative. alpha is free.

Feasibility Controls

  • feasible: The generator constructs a reference portfolio from the benchmark, caps it by 90% of position limits, renormalizes it, and then widens many constraints around that reference. It raises sector and region upper limits as needed, adjusts asset-class bounds around the reference class weights, widens factor exposure bounds around reference exposures, raises the CVaR limit above reference CVaR, raises turnover above reference turnover, and rescales position limits if their sum is too small. Because the renormalization step can push an individual weight back above its original position limit and the code does not re-cap every position afterward, this path is best read as a strong feasibility heuristic rather than a formal witness for every sampled instance.
  • infeasible: One of four modes is sampled:
    • Tight CVaR: cvar_limit is multiplied by Uniform(0.01, 0.15).
    • Asset-class lower bounds: lower bounds are scaled so their sum is greater than 1.
    • Position limits: maximum positions are scaled so their sum is below 1.
    • Turnover and sectors: turnover is set near zero and sector upper bounds are tightened below benchmark sector weights.
  • unknown: The generator samples feasible behavior with probability 0.7; otherwise it samples one infeasible mode.

Model Characteristics

  • Variables: 3 * n_assets + n_scenarios + 1.
  • Constraints: n_scenarios CVaR shortfall rows, one CVaR limit, one budget equality, up to n_sectors sector rows, up to n_regions region rows, two rows per asset class, two rows per factor, n_assets position rows, n_assets turnover decomposition equalities, and one turnover limit.
  • Density: scenario CVaR rows touch all asset weights plus one scenario auxiliary and alpha; factor rows touch all asset weights; group rows touch subsets.
  • Model class: continuous LP. The CVaR and absolute-turnover terms are linearized through auxiliary variables.

Practical Notes

These instances are useful for testing dense scenario blocks, free variables, exposure rows, and many overlapping portfolio policy constraints. The objective maximizes expected return while risk is handled as a constraint, so infeasible modes often show up as conflicting policy/risk bounds rather than unboundedness.

← All generators
Selection & Finance

Project Selection

Project selection generates a binary portfolio-optimization model that selects projects to maximize return subject to budget, risk, dependency, and high-risk-count constraints.

Objective: Max Binary variables

Overview

This generator represents capital budgeting or project portfolio selection. Each project has a cost, expected return, risk score, and possible dependencies on other projects. The optimizer chooses which projects to fund.

Although this package is oriented around LP generation, this specific model is a mixed-integer/binary optimization problem because each project is either selected or not selected.

Generator Data and Sizing

target_variables maps directly to projects:

n_projects = target_variables
projects = collect(1:n_projects)

There is no lower or upper clamp in this generator.

Scale-dependent parameters are sampled as follows.

For target_variables <= 250:

  • Cost scale around log(100_000) with standard deviation 1.5.
  • min_cost: at least 5_000, rounded from a log-normal sample.
  • max_cost: at most 500_000, rounded from a log-normal sample.
  • return_multiplier: Uniform(1.5, 4.0).
  • min_return: at least 10_000, based on min_cost * Uniform(0.8, 1.2).
  • max_return: at most 1_000_000, based on max_cost * return_multiplier.
  • budget_factor: Beta(2, 3) * 0.4 + 0.2.
  • max_risk_score: Uniform(8.0, 12.0).
  • dependency_density: Beta(2, 8) * 0.1 + 0.05.

For 250 < target_variables <= 1000:

  • Cost scale around log(500_000) with standard deviation 1.8.
  • min_cost: at least 10_000.
  • max_cost: at most 5_000_000.
  • return_multiplier: Gamma(2, 2) + 2.0.
  • min_return: at least 50_000, based on min_cost * Uniform(0.7, 1.3).
  • max_return: at most 10_000_000, based on max_cost * return_multiplier.
  • budget_factor: Beta(3, 4) * 0.35 + 0.15.
  • max_risk_score: Uniform(12.0, 18.0).
  • dependency_density: Beta(3, 7) * 0.15 + 0.1.

For target_variables > 1000:

  • Cost scale around log(2_000_000) with standard deviation 2.0.
  • min_cost: at least 50_000.
  • max_cost: at most 50_000_000.
  • return_multiplier: Gamma(1.5, 2.5) + 1.5.
  • min_return: at least 100_000, based on min_cost * Uniform(0.6, 1.4).
  • max_return: at most 100_000_000, based on max_cost * return_multiplier.
  • budget_factor: Beta(4, 6) * 0.3 + 0.1.
  • max_risk_score: Uniform(15.0, 25.0).
  • dependency_density: Beta(2, 5) * 0.15 + 0.15.

If sampled minimums exceed maximums, the generator repairs them:

  • min_cost = max_cost * 0.3 if min_cost >= max_cost.
  • min_return = max_return * 0.4 if min_return >= max_return.

Additional generated data:

  • risk_budget = n_projects * Uniform(0.8, 2.5).
  • max_high_risk_projects = max(1, ceil(Int, n_projects * high_risk_fraction)), where high_risk_fraction = Beta(2, 5) * 0.2 + 0.1.
  • high_risk_threshold = max_risk_score * Uniform(0.6, 0.8).
  • Project category is sampled from low-, medium-, and high-risk/return categories with weights 0.4, 0.4, and 0.2.
  • costs[p] are log-normal-like values between min_cost and max_cost.
  • returns[p] are correlated with cost through a sampled ROI and category noise, then clamped to [min_return, max_return].
  • risk_scores[p] are beta-distributed based on return percentile, with normal noise, and clamped to [1.0, max_risk_score].
  • dependencies are generated among projects sorted by cost. A dependency (p1, p2) means project p1 depends on project p2.
  • budget = sum(costs) * budget_factor.
  • min_selected is normally zero and is only set by the infeasible branch.

The stored struct fields are:

  • n_projects::Int
  • projects::Vector{Int}
  • costs::Dict{Int,Float64}
  • returns::Dict{Int,Float64}
  • risk_scores::Dict{Int,Float64}
  • dependencies::Vector{Tuple{Int,Int}}
  • budget::Float64
  • risk_budget::Float64
  • max_high_risk_projects::Int
  • high_risk_threshold::Float64
  • min_selected::Int

The constructor calls Random.seed!(seed), so generation is reproducible for identical inputs and resets Julia's global RNG.

LP Formulation

Sets and indices:

  • Projects p in P.
  • Dependencies (p1, p2) in D, where p1 requires p2.
  • High-risk projects H = {p in P : risk_score_p > high_risk_threshold}.

Decision variables:

x_p in {0, 1}    1 if project p is selected, 0 otherwise

Objective:

\[\max \sum_{p \in P} return_p x_p\]

Budget constraint:

\[\sum_{p \in P} cost_p x_p \le budget\]

Risk budget constraint:

\[\sum_{p \in P} risk_p x_p \le risk\_budget\]

Dependency constraints:

\[x_{p1} \le x_{p2} \quad \forall (p1, p2) \in D\]

High-risk project count, added only if H is nonempty:

\[\sum_{p \in H} x_p \le max\_high\_risk\_projects\]

Minimum selection constraint, added only if min_selected > 0:

\[\sum_{p \in P} x_p \ge min\_selected\]

Interpretation: the model chooses the best-return portfolio that fits aggregate budget and risk limits, respects prerequisite projects, and caps exposure to high-risk projects.

Feasibility Controls

The constructor sets actual_status = feasibility_status. If feasibility_status == unknown, it randomly chooses feasible with probability 0.7 and infeasible with probability 0.3.

For feasible, no extra feasibility repair is applied. Since min_selected = 0, the all-zero portfolio is feasible: it satisfies budget, risk, dependency, and high-risk constraints.

For infeasible, the generator:

  1. Sorts projects by increasing cost.
  2. Counts how many cheapest projects can fit within the budget.
  3. Sets min_selected to more than that affordable count:
       min_selected = affordable + max(1, rand(1:max(1, div(n_projects, 4))))
       min_selected = min(min_selected, n_projects)

The intended infeasibility is that the model requires selecting more projects than the budget can support, even when choosing the cheapest available projects.

For unknown, the sampled actual_status follows the same feasible or infeasible path above.

Model Characteristics

Variable count:

n_projects

Constraint count drivers:

  • One budget constraint.
  • One risk budget constraint.
  • One constraint per generated dependency.
  • Optional one high-risk-count constraint.
  • Optional one minimum-selected constraint.

Dependency density grows from pairwise checks over projects sorted by cost, so large instances can generate many dependency constraints.

This is not a continuous LP as built. The JuMP model declares binary variables with Bin. A continuous relaxation would replace x_p in {0, 1} with 0 <= x_p <= 1, but the implementation does not do that.

Practical Notes

This generator is useful for binary portfolio benchmarks with correlated cost, return, risk, and prerequisite structure. It is also a useful caveat case in an LP-focused collection because the generated model is a MIP, not a pure LP.

The constructor does not clamp target_variables; if target_variables is zero or negative, projects can be empty or malformed for downstream assumptions. Normal use should pass a positive project count.

← All generators
Land & Agriculture

Crop Planning

Generates agricultural land-allocation LPs that maximize crop profit subject to land, water, labor, market, minimum-area, and optional diversity constraints.

Objective: Max Continuous variables

Overview

This generator represents farm or regional crop planning. A planner allocates hectares across crop options with different yields, prices, production costs, water needs, labor needs, market limits, and crop-type diversity requirements. The model chooses continuous planted area for each crop.

Generator Data and Sizing

target_variables is interpreted directly as the number of crop variables:

n_crops = max(2, target_variables)

Scale-dependent ranges:

Scale conditionTotal landWater availability factorLabor availability factorMarket demand factorDiversity probability
target_variables <= 25050-500 ha0.6-0.80.7-0.91.0-1.30.5-0.8
target_variables <= 1000500-5000 ha0.65-0.850.75-0.951.1-1.50.6-0.9
otherwise5000-50000 ha0.7-0.90.8-1.01.2-2.00.7-0.95

The first 25 crops use fixed names and types:

  • 1-5: cereals: Wheat, Corn, Rice, Barley, Oats.
  • 6-10: vegetables: Tomatoes, Peppers, Lettuce, Carrots, Onions.
  • 11-15: legumes: Soybeans, Peas, Lentils, Beans, Chickpeas.
  • 16-20: industrial crops: Cotton, Sugarcane, Tobacco, Hemp, Flax.
  • 21-25: oilseeds: Sunflower, Canola, Safflower, Sesame, Peanuts.

Additional crops are named Crop_i and assigned a random type from :cereal, :vegetable, :legume, :industrial, and :oilseed.

Random data generation:

  • Yields are type-specific log-normal samples with clamps, e.g. cereals 3-10 tons/ha, vegetables 15-40, legumes 2-4, industrial 3-8, oilseeds 1.5-4.
  • Prices are type-specific log-normal samples with clamps, e.g. cereals 150-350 dollars/ton, vegetables 400-900, legumes 300-600, industrial 500-1200, oilseeds 400-700.
  • Production costs are type-specific normal samples with clamps, e.g. cereals 400-900 dollars/ha and vegetables 1200-2500.
  • Water requirements are type-specific normal samples with clamps; Rice and Sugarcane receive special high-water ranges.
  • Labor requirements are type-specific gamma samples with clamps.
  • Net profit per hectare is prices .* yields .- production_costs.
  • Market demand limits are sampled as total_land * Uniform(0.1, 0.4) * market_demand_factor.
  • With probability 0.85, minimum area requirements are created for 30%-50% of crops, preferring cereals and legumes. Minimums are 2%-8% of total land, capped by market demand, and scaled down if their sum exceeds total land.
  • Optional diversity constraints require 5%-15% of total land in crop-type groups with at least two crops.

The stored struct fields are:

  • n_crops
  • total_land
  • crop_types
  • crop_names
  • yields
  • prices
  • production_costs
  • water_requirements
  • labor_requirements
  • net_profit_per_ha
  • market_demand_limits
  • min_area_per_crop
  • water_capacity
  • labor_capacity
  • diversity_constraints

The constructor calls Random.seed!(seed), so generation is reproducible for a fixed seed but resets Julia's global RNG state.

LP Formulation

Sets and indices:

  • I = {1, ..., n_crops}: crops.
  • D: optional diversity constraints, each containing a crop type, minimum area, and crop index set.

Decision variables:

x_i >= 0

x_i is hectares allocated to crop i.

Objective:

maximize sum_{i in I} (price_i * yield_i - production_cost_i) x_i

Constraints:

Land:

sum_{i in I} x_i <= total_land

Water:

sum_{i in I} water_requirement_i x_i <= water_capacity

Labor:

sum_{i in I} labor_requirement_i x_i <= labor_capacity

Market demand:

x_i <= market_demand_limit_i    for each i in I

Minimum area, only where min_area_per_crop[i] > 0:

x_i >= min_area_i

Diversity, for each generated tuple (crop_type, min_type_area, crop_indices):

sum_{i in crop_indices} x_i >= min_type_area

Bounds:

All variables are continuous and nonnegative.

Feasibility Controls

  • feasible: The generator constructs a baseline allocation by first satisfying minimum crop areas, then distributing remaining land by nonnegative profit weights subject to market limits. Water and labor capacities are set to baseline usage times a slack factor from 1.1 to 1.3, and also at least 1.2 times the resource usage of minimum areas. Diversity constraints are added only when the baseline allocation satisfies them within a 5% tolerance.
  • infeasible: The generator computes water and labor lower bounds from minimum crop areas plus an assumed allocation of all remaining land to the least-resource crop. It then randomly makes either water or labor capacity 75% to 95% of that lower bound, while giving the other resource 10% to 40% slack. Diversity constraints are not added in infeasible mode.
  • unknown: Water and labor capacities are sampled from estimated average use times scale-specific availability factors and additional Uniform(0.6, 1.4) noise. Diversity constraints may be added without a feasibility check.

Model Characteristics

  • Variables: n_crops.
  • Constraints: one land row, one water row, one labor row, n_crops market upper-bound rows, up to n_crops minimum-area rows, and optional diversity rows.
  • Density: land, water, and labor rows touch all crop variables; market and minimum-area rows are singleton bounds represented as constraints; diversity rows touch crop subsets by type.
  • Model class: continuous LP.

Practical Notes

These instances are useful for testing dense resource rows mixed with many simple bound-like constraints. One implementation caveat is that the infeasible lower-bound calculation assumes all remaining land must be allocated, but the actual land constraint is <= total_land, not equality. If minimum-area requirements alone are small, reducing water or labor below the "use all land" lower bound may not always prove infeasibility of the implemented LP.

← All generators
Land & Agriculture

Land Use

LandUseProblem generates a parcel zoning assignment model with resource capacities, environmental restrictions, minimum zoning counts, and residential-industrial adjacency rules.

Objective: Max Binary variables

Overview

This generator represents land-use planning. A planner assigns each parcel to one zoning type, such as residential, commercial, industrial, agricultural, or conservation. Assignments create economic value from revenue minus development cost, consume infrastructure resources, and may be blocked by environmental restrictions or adjacency rules.

Generator Data and Sizing

target_variables is interpreted as approximately n_parcels * n_zoning_types. The generator first selects zoning and resource counts by size regime, then sets:

n_parcels = max(2, round(Int, target_variables / n_zoning_types))

Size regimes:

  • target_variables <= 250: n_zoning_types from 3:5, n_resources from 3:5, development cost scale 50000:150000, revenue scale 20000:80000, infrastructure capacity factor 0.6-0.8, environmental restriction probability 0.2-0.4.
  • target_variables <= 1000: n_zoning_types from 4:8, n_resources from 4:6, development cost scale 75000:250000, revenue scale 40000:120000, infrastructure capacity factor 0.65-0.85, environmental restriction probability 0.25-0.45.
  • larger targets: n_zoning_types from 5:12, n_resources from 5:8, development cost scale 100000:500000, revenue scale 60000:200000, infrastructure capacity factor 0.7-0.9, environmental restriction probability 0.3-0.5.

Adjacency constraints are enabled with probability 0.8; minimum zoning requirements with probability 0.9.

Parcel sizes follow LogNormal(log(5), 0.8) and are floored at 0.1. Development costs and revenues are generated per parcel-zoning pair using zoning-specific multipliers, parcel-specific gamma location factors, and normal noise. Resource consumption is generated per zoning-resource pair from hand-coded patterns for the first five zoning types and random Uniform(0.5, 3.0) bases for additional types, multiplied by Gamma(2, 0.5).

Initial resource capacities are based on total parcel size times mean consumption, scaled by the infrastructure capacity factor and Uniform(0.8, 1.2) noise. Environmental restrictions randomly forbid some parcel-zoning pairs, but the constructor repairs any parcel with no allowed zoning type. If minimum zoning requirements are active, the first up to three zoning types receive minimum parcel counts, usually about 10 percent of parcels each, adjusted to fit the parcel count. Environmental restrictions are relaxed when needed so enough parcels allow required zoning types.

The struct stores:

  • n_parcels, n_zoning_types, n_resources
  • parcel_sizes
  • development_costs, revenues
  • resource_consumption
  • resource_capacities
  • environmental_restrictions
  • adjacency_matrix
  • zoning_names, resource_names
  • min_counts_by_type
  • zoning_adjacency_constraints
  • minimum_zoning_requirements

The constructor calls Random.seed!(seed), resetting Julia's global RNG.

LP Formulation

The implemented model is a binary assignment MILP, not a pure LP.

Sets:

  • P = {1, ..., n_parcels} parcels
  • Z = {1, ..., n_zoning_types} zoning types
  • R = {1, ..., n_resources} resources

Decision variable:

  • x_{ij} in {0,1}: parcel i is assigned to zoning type j

Objective:

\[\max \sum_{i \in P} \sum_{j \in Z} s_i (rev_{ij} - cost_{ij}) x_{ij}\]

Each parcel receives exactly one zoning type:

\[\sum_{j \in Z} x_{ij} = 1 \quad \forall i \in P\]

Resource capacities:

\[\sum_{i \in P} \sum_{j \in Z} s_i r_{jk} x_{ij} \le C_k \quad \forall k \in R\]

Environmental restrictions:

\[x_{ij} = 0 \quad \text{for restricted parcel-zoning pairs}\]

Minimum zoning requirements, when enabled:

\[\sum_{i \in P} x_{ij} \ge m_j\]

Adjacency constraints, when enabled and at least three zoning types exist, prohibit residential type 1 next to industrial type 3:

\[x_{i,1} + x_{i',3} \le 1\]
\[x_{i,3} + x_{i',1} \le 1\]

for adjacent parcels i and i'.

Feasibility Controls

For feasible, the constructor builds a concrete witness assignment. It computes allowed zoning sets and neighbor lists, then tries to satisfy minimum counts for type 1, type 3, and type 2. Residential type 1 and industrial type 3 assignment is adjacency-aware. If it cannot find enough nonconflicting parcels, it prunes adjacency edges to make the witness possible. Commercial type 2 has additional fallbacks, including swaps and final environmental relaxation when needed.

Remaining parcels are assigned to the allowed zoning type with the best parcel-level net benefit, excluding residential-industrial adjacency conflicts when possible. The witness is then repaired: adjacency edges that conflict with the witness are pruned, minimum zoning count violations are addressed by swaps, and as a last resort minimum requirements are reduced to the achieved count. Finally, resource usage of the witness is computed and capacities are raised to at least usage times a slack factor from 1.05 to 1.25.

For infeasible, the constructor computes a resource lower bound by assigning each parcel its minimum possible consumption for each resource over currently allowed zoning types. It then sets every resource capacity below that lower bound by 5-25 percent, floored at 1e-6. Since every parcel must be assigned exactly one allowed zoning, this creates a resource-capacity contradiction.

For unknown, no feasibility enforcement branch is applied. The base random restrictions, capacities, adjacency matrix, and minimum counts are returned after only the general repairs that ensure each parcel has an allowed zoning and required types are not blocked solely by environmental restrictions.

Model Characteristics

Variable count is n_parcels * n_zoning_types, all binary. Constraint count is driven by:

  • n_parcels assignment equalities
  • n_resources resource capacity inequalities
  • one equality for each restricted parcel-zoning pair
  • up to length(min_counts_by_type) minimum zoning rows
  • two adjacency inequalities for each true ordered adjacency entry when adjacency constraints are active and n_zoning_types >= 3

The assignment and environmental rows are sparse. Resource rows are dense across all parcel-zoning variables. Adjacency rows are sparse two-variable rows. Because the adjacency matrix is symmetric and the model loops over all ordered pairs, each undirected adjacency can produce duplicated directional constraints.

Practical Notes

This generator is useful for testing binary assignment, resource-capacity infeasibility, environmental exclusions, and graph-like adjacency restrictions. The feasible path may modify generated environmental restrictions, minimum counts, and adjacency edges to admit the witness. The built model uses binary variables, so solving it requires MILP support; the LP relaxation would allow fractional parcel-zoning assignments but is not what build_model creates.