We consider an online version of the orienteering problem, where stochastic service requests arise during a first time interval from customers located on the nodes of a graph. Every request must be accepted/rejected in real time. Later, a vehicle must visit the accepted customers during a second time interval. Each accepted request implies a prize, depending on the customer, and a service cost, depending on the routing decisions. Moreover, an accepted request implies a reduction of the routing time available for possible future requests. Each acceptance/rejection decision is made to maximize the expected profit, i.e., the difference between expected prices and expected service costs. We formulate the problem as a Markov Decision Process and derive analytical expressions for the transition probabilities and the optimal policy. Since an exact policy computation is intractable, we design and test several heuristic approaches, including static approximation, simple greedy (non-anticipatory) methods, Sample Average Approximation (SAA) of the objective function using Monte Carlo sampling of future events. We perform extensive computational tests on the proposed algorithms and discuss the pros and cons of the different methods.

### A dynamic and probabilistic orienteering problem

#### Abstract

We consider an online version of the orienteering problem, where stochastic service requests arise during a first time interval from customers located on the nodes of a graph. Every request must be accepted/rejected in real time. Later, a vehicle must visit the accepted customers during a second time interval. Each accepted request implies a prize, depending on the customer, and a service cost, depending on the routing decisions. Moreover, an accepted request implies a reduction of the routing time available for possible future requests. Each acceptance/rejection decision is made to maximize the expected profit, i.e., the difference between expected prices and expected service costs. We formulate the problem as a Markov Decision Process and derive analytical expressions for the transition probabilities and the optimal policy. Since an exact policy computation is intractable, we design and test several heuristic approaches, including static approximation, simple greedy (non-anticipatory) methods, Sample Average Approximation (SAA) of the objective function using Monte Carlo sampling of future events. We perform extensive computational tests on the proposed algorithms and discuss the pros and cons of the different methods.
##### Scheda breve Scheda completa Scheda completa (DC)
2021
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: `https://hdl.handle.net/11379/547563`
##### Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

• ND
• 7
• 6