Accurately estimating optimal SDVRP solution values using regression models

Bertazzi L.
2026-01-01

Abstract

The Split Delivery Vehicle Routing Problem (SDVRP) is a challenging combinatorial optimization problem. Finding optimal solutions is typically more computationally expensive than for the classic Vehicle Routing Problem (VRP). Accurate estimation of the optimal solution value (or total distance) in the SDVRP is helpful for benchmarking heuristics and optimizing logistics decisions in real-world applications. This study develops regression models using a newly generated dataset of 2160 SDVRP instances solved to near-optimality via the SplitILS heuristic. We systematically compare linear regression with more complex machine learning models, including random forests, multilayer perceptrons (MLP), and Kolmogorov–Arnold networks (KAN), to determine whether added model complexity yields meaningful improvements in prediction accuracy. While all models achieve a very high level of estimation accuracy, we find that the improvements from complex models are marginal compared to the simpler linear regression models. To assess generalizability, we further evaluate the models on additional instances of different sizes and different service region shapes, generated using the same methodology. Our findings demonstrate that linear regression is competitive and often preferable for optimal solution value estimation in the SDVRP, providing high accuracy while maintaining computational efficiency, ease of implementation, and model transparency. This result shows that complex optimization problems do not necessarily require complex predictive models, and it provides practical guidance to researchers in this area.
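
As a rough illustration of the estimation idea described in the abstract, the following Python sketch fits a one-feature linear regression to route lengths of randomly generated instances. Everything in it is an assumption made for illustration and is not taken from the study: the instances are uniform points in a unit square, the target is the length of a nearest-neighbour tour (a cheap stand-in for a SplitILS solution value, ignoring demands and vehicle capacity), the only feature is the square root of the number of customers, and the helper names (`nearest_neighbor_length`, `fit_line`, `r_squared`, `make_dataset`) are hypothetical.

```python
import math
import random


def nearest_neighbor_length(points, depot=(0.5, 0.5)):
    """Length of a greedy depot-to-depot tour (stand-in target, not SplitILS)."""
    remaining = list(points)
    current, total = depot, 0.0
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        total += math.dist(current, nxt)
        remaining.remove(nxt)
        current = nxt
    return total + math.dist(current, depot)


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def make_dataset(sizes, reps, seed=0):
    """Synthetic instances: n uniform customers in a unit square (assumed)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for n in sizes:
        for _ in range(reps):
            pts = [(rng.random(), rng.random()) for _ in range(n)]
            xs.append(math.sqrt(n))  # assumed feature: sqrt(n), area = 1
            ys.append(nearest_neighbor_length(pts))
    return xs, ys


xs, ys = make_dataset(sizes=[20, 40, 60, 80], reps=10)
a, b = fit_line(xs, ys)
r2 = r_squared(xs, ys, a, b)
print(f"intercept={a:.3f} slope={b:.3f} R^2={r2:.3f}")
```

A fitted line of this kind could serve as a baseline when checking whether a more complex model yields any meaningful gain on the same features; the study's actual features, targets, and models are not reproduced here.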

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/641508