Publication Type

Journal Article

Version

acceptedVersion

Publication Date

3-2022

Abstract

Deep reinforcement learning is increasingly applied to the vehicle routing problem (VRP), where a learnt policy governs the selection of the next node to visit. However, existing methods do not handle well the pairing and precedence relationships in the pickup and delivery problem (PDP), a representative variant of VRP. To address this issue, we propose a novel neural network integrated with a heterogeneous attention mechanism that empowers the policy in deep reinforcement learning to select nodes automatically. In particular, the heterogeneous attention mechanism prescribes a distinct attention for each role of the nodes while taking into account the precedence constraint, i.e., a pickup node must precede its paired delivery node. Further integrated with a masking scheme, the learnt policy is expected to find higher-quality solutions to PDP. Extensive experimental results show that our method outperforms both the state-of-the-art heuristic and the state-of-the-art deep learning model, and generalizes well to different distributions and problem sizes.
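
The precedence constraint and masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `feasible_mask` and the node indexing (node 0 is the depot, nodes 1..n are pickups, and node i + n is the delivery paired with pickup i) are assumptions made for illustration only.

```python
def feasible_mask(n_pairs, visited):
    """Return a list where True marks a node that may be visited next.

    Assumed (hypothetical) indexing: 0 = depot, 1..n_pairs = pickups,
    n_pairs+1..2*n_pairs = deliveries, delivery of pickup i is i + n_pairs.
    """
    mask = [False] * (2 * n_pairs + 1)  # the depot (index 0) is never selectable here
    for node in range(1, 2 * n_pairs + 1):
        if node in visited:
            continue  # already served, stays masked out
        if node <= n_pairs:
            mask[node] = True  # an unvisited pickup is always allowed
        else:
            # a delivery is allowed only after its paired pickup was visited
            mask[node] = (node - n_pairs) in visited
    return mask


# Two pickup-delivery pairs; the vehicle has left the depot and served pickup 1.
print(feasible_mask(2, {0, 1}))
```

In a learnt policy, a mask of this kind would typically be applied to the node scores before sampling, so that infeasible nodes receive zero selection probability.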

Keywords

Reinforcement learning, Routing, Peer-to-peer computing, Heuristic algorithms, Deep learning, Decoding, Decision making, Heterogeneous attention, Deep reinforcement learning, Pickup and delivery problem

Discipline

Artificial Intelligence and Robotics | Transportation

Research Areas

Intelligent Systems and Optimization

Publication

IEEE Transactions on Intelligent Transportation Systems

Volume

23

Issue

3

First Page

2306

Last Page

2315

ISSN

1524-9050

Identifier

10.1109/TITS.2021.3056120

Publisher

Institute of Electrical and Electronics Engineers

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1109/TITS.2021.3056120
