Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2023
Abstract
We address the problem of coordinating multiple agents in dynamic police patrol scheduling via a Reinforcement Learning (RL) approach. Our approach utilizes Multi-Agent Value Function Approximation (MAVFA) with a rescheduling heuristic to learn dispatching and rescheduling policies jointly. Police operations are often divided into multiple sectors for more effective and efficient operations. In a dynamic setting, incidents occur throughout the day across different sectors, disrupting initially planned patrol schedules. To maximize policing effectiveness, police agents from different sectors cooperate by sending reinforcements to support one another in their incident response and even routine patrol. This poses an interesting research challenge: how to make such complex dispatching and rescheduling decisions involving multiple agents in a coordinated fashion within an operationally reasonable time. Unlike existing Multi-Agent RL (MARL) approaches, which solve similar problems by decomposing either the problem or the action into multiple components, our approach learns the dispatching and rescheduling policies jointly without any decomposition step. In addition, instead of directly searching over the joint action space, we incorporate an iterative best response procedure as a decentralized optimization heuristic and an explicit coordination mechanism for scalable and coordinated decision-making. We evaluate our approach against the commonly adopted two-stage approach and conduct a series of ablation studies to ascertain the effectiveness of our proposed learning and coordination mechanisms.
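The iterative best response idea mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the paper's implementation: the function name `iterative_best_response`, the toy value function `toy_value`, and the binary action encoding (0 = keep patrolling, 1 = send a reinforcement) are all assumptions made for illustration. Each agent in turn picks the action that maximizes a shared joint value while the other agents' actions are held fixed, and the loop stops once no agent changes its action, which avoids enumerating the full joint action space.

```python
from typing import Callable, List, Sequence, Tuple


def iterative_best_response(
    action_sets: Sequence[Sequence[int]],
    joint_value: Callable[[Tuple[int, ...]], float],
    max_rounds: int = 20,
) -> Tuple[int, ...]:
    """Generic iterative best response over a finite joint action space."""
    # Arbitrary starting point: every agent takes its first listed action.
    joint: List[int] = [actions[0] for actions in action_sets]
    for _ in range(max_rounds):
        changed = False
        for i, actions in enumerate(action_sets):
            # Evaluate agent i's alternatives with all other actions fixed.
            def value_if(a: int, i: int = i) -> float:
                return joint_value(tuple(joint[:i] + [a] + joint[i + 1:]))

            best = max(actions, key=value_if)
            # Switch only on strict improvement, so ties cannot cause cycling.
            if value_if(best) > joint_value(tuple(joint)):
                joint[i] = best
                changed = True
        if not changed:
            break  # no agent can improve unilaterally
    return tuple(joint)


def toy_value(joint: Tuple[int, ...]) -> float:
    # Hypothetical value: one incident needs a single reinforcement (+10),
    # and every agent that leaves its patrol costs 3.
    helpers = sum(joint)
    return 10 * min(helpers, 1) - 3 * helpers


# Three hypothetical sector agents, each choosing between 0 and 1.
result = iterative_best_response([[0, 1]] * 3, toy_value)
```

In this toy setting the procedure settles on a joint action in which exactly one agent sends a reinforcement. The result is a local optimum with respect to unilateral changes and is not guaranteed to be globally optimal in general.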
Keywords
Agent-based and Multi-agent Systems, Multi-agent learning, Planning and Scheduling, Learning in planning and scheduling, police agents
Discipline
Artificial Intelligence and Robotics | Operations Research, Systems Engineering and Industrial Engineering | Theory and Algorithms
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023: Macao, August 19-25
First Page
153
Last Page
161
ISBN
9781956792034
Identifier
10.24963/ijcai.2023/18
Publisher
AAAI Press
City or Country
Washington, DC
Citation
JOE, Waldy and LAU, Hoong Chuin.
Learning to send reinforcements: Coordinating multi-agent dynamic police patrol dispatching and rescheduling via reinforcement learning. (2023). Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023: Macao, August 19-25. 153-161.
Available at: https://ink.library.smu.edu.sg/sis_research/8103
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.24963/ijcai.2023/18
Included in
Artificial Intelligence and Robotics Commons, Operations Research, Systems Engineering and Industrial Engineering Commons, Theory and Algorithms Commons