Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2023
Abstract
Large Language Models (LLMs) have shown impressive capabilities in various applications, but they still face a range of inconsistency issues. Existing work primarily focuses on inconsistency within a single LLM, while we complementarily explore the inter-consistency among multiple LLMs in collaboration. To examine whether LLMs can collaborate effectively to reach a consensus on a shared goal, we focus on commonsense reasoning and introduce a formal debate framework (FORD) that conducts a three-stage debate among LLMs, aligned with real-world scenarios: fair debate, mismatched debate, and roundtable debate. Extensive experiments on various datasets show that LLMs can effectively collaborate to reach a consensus despite noticeable inter-inconsistencies, but imbalances in their abilities can lead to domination by superior LLMs. Leveraging a more advanced LLM such as GPT-4 as an authoritative judge can further boost collaboration performance. Our work contributes to understanding the inter-consistency among LLMs and lays the foundation for developing future collaboration methods. Code and data are available at https://github.com/WasteWood/FORD.
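The debate setup described in the abstract can be pictured as an alternating exchange between two debater LLMs followed by a judge model that issues the final consensus. The sketch below is illustrative only and is not the authors' released implementation (see the FORD repository for that); the `chat` helper, model names, prompts, and round count are assumptions.

```python
# Minimal sketch of a two-LLM debate with a judge, loosely following the
# debate procedure described in the abstract. The `chat` helper, model names,
# and prompts are placeholders, not the authors' released code.

def chat(model: str, prompt: str) -> str:
    """Placeholder LLM call; replace with a real API client. Here it just echoes."""
    return f"[{model}] response to: {prompt[:60]}..."

def debate(question: str, debater_a: str, debater_b: str,
           judge: str, rounds: int = 2) -> str:
    # Each debater first gives an initial answer with a short explanation.
    stance_a = chat(debater_a, f"Question: {question}\nGive your answer and a short explanation.")
    stance_b = chat(debater_b, f"Question: {question}\nGive your answer and a short explanation.")

    # Debaters alternately read the opponent's argument and either defend
    # or revise their own answer for a fixed number of rounds.
    for _ in range(rounds):
        stance_a = chat(debater_a,
                        f"Question: {question}\nOpponent's argument: {stance_b}\n"
                        "Defend or revise your answer.")
        stance_b = chat(debater_b,
                        f"Question: {question}\nOpponent's argument: {stance_a}\n"
                        "Defend or revise your answer.")

    # A (possibly stronger) judge model summarizes the debate and gives the
    # final consensus answer.
    return chat(judge,
                f"Question: {question}\nDebater A: {stance_a}\nDebater B: {stance_b}\n"
                "Summarize the debate and give the final answer.")

if __name__ == "__main__":
    print(debate("Can a penguin fly?", "model-a", "model-b", "judge-model"))
```

Under this framing, the "mismatched debate" stage would pair debaters of unequal capability, and the judge role corresponds to using a stronger model such as GPT-4 to arbitrate.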
Discipline
Databases and Information Systems | Programming Languages and Compilers
Research Areas
Data Science and Engineering
Publication
Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10
First Page
1
Last Page
19
Publisher
Association for Computational Linguistics
City or Country
Singapore
Citation
XIONG, Kai; DING, Xiao; CAO, Yixin; LIU, Ting; and QIN, Bing.
Examining the Inter-consistency of large language models: An in-depth analysis via debate. (2023). Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10. 1-19.
Available at: https://ink.library.smu.edu.sg/sis_research/8391
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.