Research Collection School Of Computing and Information Systems

Integrating temporal difference methods and self‐organizing neural networks for reinforcement learning with delayed evaluative feedback

Ah-hwee TAN, Singapore Management University
Ning LU
Dan XIAO

Publication Type

Journal Article

Version

publishedVersion

Publication Date

2-2008

Abstract

This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the optimal actions based on an action selection policy. We have developed TD-FALCON systems using various TD learning strategies and compared their performance in terms of task completion, learning speed, as well as time and space efficiency. Experiments based on a minefield navigation task have shown that TD-FALCON systems are able to learn effectively with both immediate and delayed reinforcement and achieve stable performance at a pace much faster than that of standard gradient-descent-based reinforcement learning systems.
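
To illustrate the on-policy versus off-policy distinction the abstract draws between SARSA and Q-learning, the following is a minimal tabular sketch of the two textbook TD targets. It is not the TD-FALCON model, which estimates values with a self-organizing network rather than a lookup table. The function names `td_target` and `td_update`, the learning rate `alpha`, the discount factor `gamma`, and the sample numbers are all assumptions made for this example.

```python
def td_target(reward, next_q_values, next_action, gamma, on_policy):
    """Return the TD target for a single transition.

    on_policy=True  gives the SARSA target, which bootstraps from the
    value of the action actually selected in the next state.
    on_policy=False gives the Q-learning target, which bootstraps from
    the largest value available in the next state.
    """
    if on_policy:
        bootstrap = next_q_values[next_action]
    else:
        bootstrap = max(next_q_values)
    return reward + gamma * bootstrap


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha toward the TD target."""
    return q_value + alpha * (target - q_value)


# Hypothetical transition: two actions are available in the next state,
# and the agent's policy happened to pick action 0 (the lower-valued one).
next_q = [0.2, 0.8]
sarsa_target = td_target(1.0, next_q, next_action=0, gamma=0.5, on_policy=True)
q_target = td_target(1.0, next_q, next_action=0, gamma=0.5, on_policy=False)

# Apply one update step to an estimate that starts at zero.
new_q_sarsa = td_update(0.0, sarsa_target, alpha=0.5)
new_q_qlearning = td_update(0.0, q_target, alpha=0.5)
```

In this sketch the two targets differ only when the next action chosen by the policy is not the greedy one, which is the case in the hypothetical transition above.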

Keywords

Reinforcement learning, self-organizing neural networks (NNs), temporal difference (TD) methods

Discipline

Computer Engineering | Databases and Information Systems | OS and Networks

Research Areas

Data Science and Engineering

Publication

IEEE Transactions on Neural Networks

Volume

Issue

First Page

230

Last Page

244

ISSN

1045-9227

Identifier

10.1109/TNN.2007.905839

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Citation

TAN, Ah-hwee; LU, Ning; and XIAO, Dan. Integrating temporal difference methods and self‐organizing neural networks for reinforcement learning with delayed evaluative feedback. (2008). IEEE Transactions on Neural Networks. 19, (2), 230-244.
Available at: https://ink.library.smu.edu.sg/sis_research/5237

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/TNN.2007.905839

Included in

Computer Engineering Commons, Databases and Information Systems Commons, OS and Networks Commons
