Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2025
Abstract
Current voice agents wait for a user to complete their verbal instruction before responding; yet this is misaligned with how humans engage in everyday conversational interaction, where interlocutors use multimodal signaling (e.g., nodding, grunting, or looking at referred-to objects) to ensure conversational grounding. We designed an embodied VR agent that exhibits multimodal signaling behaviors in response to situated prompts by turning its head or by visually highlighting objects being discussed or referred to. We explore how people prompt this agent to design and manipulate objects in a VR scene. Through a Wizard of Oz study, we found that participants interacting with an agent that indicated its understanding of spatial and action references were able to prevent errors 30% of the time, and were more satisfied and confident in the agent’s abilities. These findings underscore the importance of designing multimodal signaling communication techniques for future embodied agents.
Keywords
situated prompting, multimodal signaling, common ground, human-AI collaboration
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
CHI '25: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, April 26 - May 1
First Page
1
Last Page
25
Identifier
10.1145/3706598.3713110
Publisher
ACM
City or Country
New York
Citation
ZHANG, Tianyi; AU YEUNG, Colin; AURELIA, Emily; ONISHI, Yuki; CHULPONGSATORN, Neil; LI, Jiannan; and TANG, Anthony.
Prompting an embodied AI agent: How embodiment and multimodal signaling affects prompting behaviour. (2025). CHI '25: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, April 26 - May 1. 1-25.
Available at: https://ink.library.smu.edu.sg/sis_research/10272
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3706598.3713110