Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

5-2024

Abstract

Deep Reinforcement Learning (DRL) policies are vulnerable to adversarial noise in observations, which can have disastrous consequences in safety-critical environments. For instance, a self-driving car that receives adversarially perturbed sensory observations about traffic signs (e.g., a stop sign physically altered to be perceived as a speed limit sign) can cause a fatal accident. Leading existing approaches for making RL algorithms robust to an observation-perturbing adversary have focused on (a) regularization approaches that make expected value objectives robust by adding adversarial loss terms; or (b) employing "maximin" (i.e., maximizing the minimum value) notions of robustness. While regularization approaches are adept at reducing the probability of successful attacks, their performance drops significantly when an attack is successful. On the other hand, maximin objectives, while robust, can be extremely conservative. To this end, we focus on optimizing a well-studied robustness objective, namely regret. To ensure the solutions provided are not too conservative, we optimize an approximation of regret using three different methods. We demonstrate that our methods outperform the best existing approaches for adversarial RL problems across a variety of standard benchmarks from the literature.

Keywords

Robust Reinforcement Learning, Adversarial Robustness, Regret

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Auckland, New Zealand, May 6-10

First Page

2633

Last Page

2640

Identifier

10.5555/3635637.3663250

Publisher

ACM

City or Country

New York

Additional URL

https://doi.org/10.5555/3635637.3663250
