Publication Type
Journal Article
Version
publishedVersion
Publication Date
12-2007
Abstract
Temporal-Difference–Fusion Architecture for Learning, Cognition, and Navigation (TD-FALCON) is a generalization of adaptive resonance theory (a class of self-organizing neural networks) that incorporates TD methods for real-time reinforcement learning. In this paper, we investigate how a team of TD-FALCON networks may cooperate to learn and function in a dynamic multiagent environment based on a minefield navigation task and a predator/prey pursuit task. Experiments on the navigation task demonstrate that TD-FALCON agent teams are able to adapt and function well in a multiagent environment without an explicit mechanism of collaboration. In comparison, traditional Q-learning agents using gradient-descent-based feedforward neural networks, trained with the standard backpropagation and the resilient-propagation (RPROP) algorithms, produce a significantly poorer level of performance. For the predator/prey pursuit task, we experiment with various cooperative strategies and find that a combination of a high-level compressed state representation and a hybrid reward function produces the best results. Using the same cooperative strategy, the TD-FALCON team also outperforms the RPROP-based reinforcement learners in terms of both task completion rate and learning efficiency.
Keywords
Multiagent cooperative learning, reinforcement learning (RL), self-organizing neural architectures
Discipline
Computer and Systems Architecture | Computer Engineering | Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Volume
37
Issue
6
First Page
1567
Last Page
1580
ISSN
1083-4419
Identifier
10.1109/TSMCB.2007.907040
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
XIAO, Dan and TAN, Ah-hwee.
Self-organizing neural architectures and cooperative learning in a multiagent environment. (2007). IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics. 37, (6), 1567-1580.
Available at: https://ink.library.smu.edu.sg/sis_research/5221
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TSMCB.2007.907040