Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
6-2024
Abstract
Knowledge base question generation (KBQG) aims to generate natural language questions from a set of triplet facts extracted from a knowledge base (KB). Existing methods have significantly boosted KBQG performance via pre-trained language models (PLMs), thanks to their richly endowed semantic knowledge. With the advance of pre-training techniques, large language models (LLMs) (e.g., GPT-3.5) undoubtedly possess far more semantic knowledge. Therefore, how to effectively organize and exploit this abundant knowledge for KBQG is the focus of our study. In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. The framework incorporates “skeleton heuristics”, which provide more fine-grained guidance associated with each input to stimulate LLMs to generate optimal questions, encompassing essential elements such as the question phrase and the auxiliary verb. More specifically, we devise an automatic data construction strategy that leverages ChatGPT to build a skeleton training dataset, on which we employ a soft prompting approach to train a BART model dedicated to generating the skeleton associated with each input. Subsequently, skeleton heuristics are encoded into the prompt to incentivize GPT-3.5 to generate the desired questions. Extensive experiments demonstrate that SGSH achieves new state-of-the-art performance on KBQG tasks. The code is available on GitHub.
Keywords
Knowledge base question generation, KBQG, Natural language processing, Skeleton Heuristics, Large language models, LLMs
Discipline
Artificial Intelligence and Robotics | Computer Sciences
Research Areas
Data Science and Engineering; Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2024) : Mexico City, Mexico, June 16-21
First Page
4613
Last Page
4625
Identifier
10.18653/v1/2024.findings-naacl.287
Publisher
Association for Computational Linguistics
City or Country
Mexico City, Mexico
Citation
GUO, Shasha; LIAO, Lizi; ZHANG, Jing; WANG, Yanling; LI, Cuiping; and CHEN, Hong.
SGSH : Stimulate Large Language Models with skeleton heuristics for knowledge base question generation. (2024). Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2024) : Mexico City, Mexico, June 16-21. 4613-4625.
Available at: https://ink.library.smu.edu.sg/sis_research/9702
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2024.findings-naacl.287