Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

10-2025

Abstract

Large Language Models (LLMs) demonstrate remarkable in-context learning capabilities but often struggle with complex, multi-step reasoning. Multi-Agent Debate (MAD) frameworks partially address these limitations by enabling iterative agent interactions, yet they neglect valuable historical insights by treating each new debate independently. In this paper, we propose Memory-Augmented MAD (MeMAD), a parameter-free MAD framework that systematically organizes and reuses past debate transcripts. MeMAD stores structured representations of successful and unsuccessful reasoning attempts, enriched with self-reflections and peer feedback, and retrieves them via semantic similarity at inference time to inform new reasoning tasks. Our experiments on challenging mathematical reasoning, scientific question answering, and language understanding benchmarks show that MeMAD achieves significant accuracy gains (up to 3.3% over conventional MAD baselines) without any parameter updates. Our findings underscore structured memory as a pivotal mechanism for achieving deeper and more reliable multi-agent reasoning in LLMs. Code is available at https://github.com/LSHCoding/MeMAD.
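The retrieval mechanism described in the abstract (storing structured debate records and fetching the most semantically similar ones at inference time) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `DebateRecord` and `MemoryBank`, the record fields, and the toy embedder are hypothetical placeholders, and a real system would use an LLM-based sentence encoder for the embeddings.

```python
# Minimal sketch of a memory bank for past debate transcripts, retrieved by
# semantic similarity. Illustrative only; all names and fields are assumed.
from dataclasses import dataclass, field
from typing import Callable, List, Optional
import numpy as np


@dataclass
class DebateRecord:
    question: str
    transcript: str           # full multi-agent debate transcript
    reflection: str           # agent self-reflection on the outcome
    peer_feedback: str        # feedback gathered from peer agents
    successful: bool          # whether the debate reached the correct answer
    embedding: Optional[np.ndarray] = field(default=None, repr=False)


class MemoryBank:
    """Stores structured debate records and retrieves them by cosine similarity."""

    def __init__(self, embed_fn: Callable[[str], np.ndarray]):
        self.embed_fn = embed_fn          # any text -> vector embedding function
        self.records: List[DebateRecord] = []

    def add(self, record: DebateRecord) -> None:
        record.embedding = self.embed_fn(record.question)
        self.records.append(record)

    def retrieve(self, query: str, k: int = 3) -> List[DebateRecord]:
        q = self.embed_fn(query)

        def cosine(a: np.ndarray, b: np.ndarray) -> float:
            return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

        ranked = sorted(self.records, key=lambda r: cosine(q, r.embedding), reverse=True)
        return ranked[:k]


if __name__ == "__main__":
    # Toy bag-of-words embedder for demonstration only.
    def toy_embed(text: str, dim: int = 64) -> np.ndarray:
        v = np.zeros(dim)
        for tok in text.lower().split():
            v[hash(tok) % dim] += 1.0
        return v

    bank = MemoryBank(toy_embed)
    bank.add(DebateRecord("What is 17 * 24?", "...", "...", "...", True))
    print(bank.retrieve("Compute 17 times 24", k=1))
```

The retrieved records would then be placed in the agents' context before a new debate; no model parameters are updated at any point.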

Discipline

Artificial Intelligence and Robotics

Areas of Excellence

Digital transformation

Publication

Conference on Language Modeling (COLM 2025), Montreal, Canada, October 7-10

First Page

1

Last Page

18

City or Country

Montreal, Canada

Additional URL

https://openreview.net/forum?id=zLbmsdyTiN#discussion
