Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2025
Abstract
Automating web navigation, which aims to build a web agent that follows user instructions to complete tasks such as booking flights by interacting with websites, has received increasing attention due to its practical value. Although existing web agents are mostly equipped with visual perception, planning, and memory abilities, their reasoning processes still deviate from human cognition. In this work, we study human thought patterns to empower agents with more human-like abilities in web navigation. To this end, we propose a novel multimodal web agent framework called WebExperT, which is designed to emulate the human planning process of “thinking fast and slow” to effectively decompose complex user instructions. Furthermore, WebExperT leverages experiential learning by reflecting on failures to continuously refine planning and decision-making outcomes. Experimental results on the Mind2Web benchmark demonstrate the superiority of WebExperT in both supervised and unsupervised settings.
Discipline
Artificial Intelligence and Robotics | Programming Languages and Compilers
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, Vienna, Austria, 2025 July 27 - August 1
First Page
14232
Last Page
14251
Identifier
10.18653/v1/2025.acl-long.697
Publisher
Association for Computational Linguistics
City or Country
USA
Citation
LUO, Haohao; KUANG, Jiayi; LIU, Wei; SHEN, Ying; LUAN, Jian; and DENG, Yang.
Browsing like human: A multimodal web agent with experiential fast-and-slow thinking. (2025). Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, Vienna, Austria, 2025 July 27 - August 1. 14232-14251.
Available at: https://ink.library.smu.edu.sg/sis_research/10377
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2025.acl-long.697
Included in
Artificial Intelligence and Robotics Commons, Programming Languages and Compilers Commons