Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
9-2025
Abstract
Recent Deep Reinforcement Learning (DRL) techniques have advanced solutions to Vehicle Routing Problems (VRPs). However, many of these methods focus exclusively on distance-oriented objectives (i.e., minimizing route length) and often overlook drivers' implicit route preferences. These preferences, which are crucial in practice, are challenging to model with traditional DRL approaches. To address this gap, we propose a preference-based DRL method whose reward design and optimization objective are specialized to learn preferences from historical routes. Our experiments demonstrate that the method aligns generated solutions more closely with human preferences. Moreover, it exhibits strong generalization across a variety of instances, offering a robust solution for different VRP scenarios.
Keywords
preference learning, deep reinforcement learning, historical route estimation, vehicle routing problem, driver behavior modeling, reward design, human-centered optimization, route preference alignment, neural combinatorial optimization, generalization
Discipline
Artificial Intelligence and Robotics
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2025)
Identifier
10.24963/ijcai.2025/955
City or Country
Montreal, Canada
Citation
PAN, Boshen; WU, Yaoxin; CAO, Zhiguang; HOU, Yaqing; ZOU, Guangyu; and ZHANG, Qiang.
Preference-based deep reinforcement learning for historical route estimation. (2025). Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2025).
Available at: https://ink.library.smu.edu.sg/sis_research/10558
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.24963/ijcai.2025/955