Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

4-2025

Abstract

General virtual agents need to handle multimodal observations, master complex action spaces, and self-improve in dynamic, open-domain environments. However, existing environments are often domain-specific and require complex setups, which limits agent development and evaluation in real-world settings. As a result, current evaluations lack in-depth analyses that decompose fundamental agent capabilities. We introduce AgentStudio, a trinity of environments, tools, and benchmarks to address these issues. AgentStudio provides a lightweight, interactive environment with highly generic observation and action spaces, e.g., video observations and GUI/API actions. It integrates tools for creating online benchmark tasks, annotating GUI elements, and labeling actions in videos. Based on our environment and tools, we curate an online task suite that benchmarks both GUI interactions and function calling with efficient auto-evaluation. We also reorganize existing datasets and collect new ones using our tools to establish three datasets: GroundUI, IDMBench, and CriticBench. These datasets evaluate fundamental agent abilities, including GUI grounding, learning from videos, and success detection, pointing to the desiderata for robust, general, and open-ended virtual agents.

Discipline

Artificial Intelligence and Robotics

Areas of Excellence

Digital transformation

Publication

Proceedings of the Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28

First Page

1

Last Page

42

City or Country

Singapore

Citation

ZHENG, Longtao; HUANG, Zhiyuan; XUE, Zhenghai; WANG, Xinrun; AN, Bo; and YAN, Shuicheng. AgentStudio: A toolkit for building general virtual agents. (2025). Proceedings of the Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28. 1-42.
Available at: https://ink.library.smu.edu.sg/sis_research/10717

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Additional URL

https://openreview.net/forum?id=axUf8BOjnH

Included in

Artificial Intelligence and Robotics Commons

Research Collection School Of Computing and Information Systems

AgentStudio: A toolkit for building general virtual agents
