Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
3-2025
Abstract
Training generally capable agents in complex environments is a challenging task that involves identifying the “right” environments at the training stage. Recent research has highlighted the potential of the Unsupervised Environment Design framework, which generates environment instances/levels adaptively at the frontier of the agent’s capabilities using regret measures. While regret approaches have shown promise in generating feasible environments, they can produce difficult environments that are challenging for an RL agent to learn from. This is because regret represents the best-case (upper bound) learning potential and not the actual learning potential of an environment. To address this, we propose an alternative mechanism that employs marginal benefit, focusing on the improvement (in terms of generalized performance) the agent policy gets for a given environment. The advantage of this new mechanism is that it is agent-focused (and not environment-focused) and generates the “right” environments depending on the agent’s policy. Additionally, to improve the generalizability of the agent, we introduce a representative state diversity metric that aims to generate varied experiences for the agent. Finally, we provide detailed experimental results and ablation analysis to showcase the effectiveness of our methods. We obtain state-of-the-art (SOTA) results among RL-based environment generation methods.
Discipline
Artificial Intelligence and Robotics
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), Philadelphia, Pennsylvania, February 25 - March 4
Volume
39
First Page
18253
Last Page
18261
Identifier
10.1609/aaai.v39i17.34008
Publisher
AAAI
City or Country
Philadelphia
Citation
LI, Dexun; LI, Wenjun; and VARAKANTHAM, Pradeep.
Marginal benefit driven RL teacher for unsupervised environment design. (2025). Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), Philadelphia, Pennsylvania, February 25 - March 4. 39, 18253-18261.
Available at: https://ink.library.smu.edu.sg/sis_research/10748
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1609/aaai.v39i17.34008