Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

2-2023

Abstract

We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold p ∈ [0,1] over the infinite time horizon in general Lipschitz continuous systems. Our method leverages advances in machine learning literature and it represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems - it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy if it does not satisfy the reach-avoid specification. We validate our approach on 3 stochastic non-linear reinforcement learning tasks.

Discipline

Programming Languages and Compilers

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, DC, 2023 February 7-14

Volume

37

First Page

11926

Last Page

11935

Identifier

10.1609/aaai.v37i10.26407

City or Country

Washington, DC

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1609/aaai.v37i10.26407

Share

COinS