Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
3-2024
Abstract
Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality CoT rationales is usually time-consuming and costly. Furthermore, the annotated rationales are often inaccurate because they omit essential external information. To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals. The T-SciQ approach generates high-quality CoT rationales as teaching signals and uses them to train much smaller models to perform CoT reasoning in complex modalities. Additionally, we introduce a novel data mixing strategy to produce more effective teaching data samples for simple and complex science question answering problems. Extensive experimental results show that our T-SciQ method achieves a new state-of-the-art performance on the ScienceQA benchmark, with an accuracy of 96.18%. Moreover, our approach outperforms the most powerful fine-tuned baseline by 4.5%. The code is publicly available at https://github.com/T-SciQ/T-SciQ.
Keywords
Complex problems, High quality, Language model, Language processing, Model signals, Multi-modal, Multimodal chains, Natural languages
Discipline
Artificial Intelligence and Robotics | Databases and Information Systems | Numerical Analysis and Scientific Computing
Publication
Proceedings of the 38th AAAI Conference on Artificial Intelligence: Vancouver, February 20-27
Volume
38
First Page
19162
Last Page
19170
Identifier
10.1609/aaai.v38i17.29884
Publisher
AAAI Press
City or Country
Palo Alto, CA
Citation
WANG, Lei; HU, Yi; HE, Jiabang; XU, Xing; LIU, Ning; LIU, Hui; and SHEN, Heng Tao.
T-SciQ: Teaching multimodal Chain-of-Thought reasoning via large language model signals for science question answering. (2024). Proceedings of the 38th AAAI Conference on Artificial Intelligence: Vancouver, February 20-27. 38, 19162-19170.
Available at: https://ink.library.smu.edu.sg/sis_research/8756
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1609/aaai.v38i17.29884
Included in
Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons
Comments
Wang Lei SIS PhD