Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
12-2023
Abstract
Large Language Models (LLMs) have demonstrated significant ability in various Natural Language Processing tasks. However, their effectiveness is highly dependent on the phrasing of the task prompt, leading to research on automatic prompt optimization using labeled task data. We reveal that these prompt optimization techniques are vulnerable to distribution shifts such as subpopulation shifts, which are common for LLMs in real-world scenarios such as customer review analysis. In this light, we propose a new problem of robust prompt optimization for LLMs against distribution shifts, which requires that a prompt optimized over a labeled source group simultaneously generalize to an unlabeled target group. To solve this problem, we propose the Generalized Prompt Optimization framework, which incorporates unlabeled data from the target group into prompt optimization. Extensive experimental results demonstrate the effectiveness of the proposed framework, with significant performance improvement on the target group and comparable performance on the source group.
Keywords
Language model, Language processing, Natural language, Optimizations, Optimization framework
Discipline
Artificial Intelligence and Robotics | Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
2023 Conference on Empirical Methods in Natural Language Processing: Singapore, December 6-10: Proceedings
First Page
1539
Last Page
1554
ISBN
9798891760608
Identifier
10.18653/v1/2023.emnlp-main.95
Publisher
Association for Computational Linguistics
City or Country
Stroudsburg, PA
Citation
LI, Moxin; WANG, Wenjie; FENG, Fuli; CAO, Yixin; ZHANG, Jizhi; and CHUA, Tat-Seng.
Robust prompt optimization for large language models against distribution shifts. (2023). 2023 Conference on Empirical Methods in Natural Language Processing: Singapore, December 6-10: Proceedings. 1539-1554.
Available at: https://ink.library.smu.edu.sg/sis_research/8393
Copyright Owner and License
Publisher
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2023.emnlp-main.95