Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2025
Abstract
Quantizing large language models (LLMs) is essential for reducing memory and computational costs in natural language processing. Existing methods combine quantization with parameter-efficient fine-tuning but often fail to meet practical performance requirements. This paper introduces MeMoTune, a novel fine-tuning framework for quantized LLMs. By employing a measure and moment approach within a low-rank approximation framework in probability measure space, MeMoTune optimizes the objective function for superior fine-tuning results. The update process is further refined through scaled gradients, enhancing convergence efficiency and noise robustness. Experiments on tasks such as text generation, summarization, and natural language understanding show that MeMoTune significantly outperforms state-of-the-art methods; e.g., fine-tuning Llama2-13B on GSM8K improves accuracy by 5.5%, while fine-tuning DeBERTaV3-base on the CoLA task of GLUE increases Matthews correlation by 1.7%. The code is publicly available at: https://github.com/hddyyyb/MeMoTune.
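The abstract names two technical ingredients: a low-rank approximation of the weight update and a scaled-gradient refinement of the factor updates. The sketch below is a minimal, self-contained NumPy illustration of scaled (Gram-preconditioned) gradient descent fitting a rank-r correction to a quantization residual, in the spirit of ScaledGD-style preconditioning; the measure-and-moment formulation, the real quantizer, and every name and shape here are illustrative assumptions, not the authors' implementation.

# Minimal sketch (assumptions, not MeMoTune itself): scaled gradient
# descent on low-rank factors L, R that absorb the residual between a
# full-precision weight W and its quantized copy W_q.
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 48, 4                        # toy layer shape and adapter rank

W = rng.standard_normal((d, k))            # full-precision weight (toy)
W_q = np.round(W * 4) / 4                  # stand-in uniform quantizer (toy)
Delta = W - W_q                            # residual the low-rank pair should absorb

L = 0.1 * rng.standard_normal((d, r))      # left factor (LoRA-style "A")
R = 0.1 * rng.standard_normal((k, r))      # right factor (LoRA-style "B")
eta, eps = 0.5, 1e-8                       # step size, Gram regularizer

for _ in range(200):
    E = L @ R.T - Delta                    # gradient of 0.5 * ||L R^T - Delta||_F^2
    # Scaled gradient step: right-multiply each factor's gradient by the
    # inverse Gram matrix of the other factor, which makes convergence
    # insensitive to how ill-conditioned the factors are.
    P_L = np.linalg.inv(R.T @ R + eps * np.eye(r))
    P_R = np.linalg.inv(L.T @ L + eps * np.eye(r))
    L, R = L - eta * (E @ R) @ P_L, R - eta * (E.T @ L) @ P_R

err = np.linalg.norm(W_q + L @ R.T - W) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.4f}")

On a quadratic objective like this toy one, the Gram-matrix preconditioning behaves like a per-factor Newton step, which is the kind of convergence benefit the abstract attributes to scaled gradients.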
Discipline
Artificial Intelligence and Robotics | Programming Languages and Compilers
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria, July 27 - August 1
First Page
4036
Last Page
4050
Identifier
10.18653/v1/2025.findings-acl.208
Publisher
Association for Computational Linguistics (ACL)
City or Country
Vienna, Austria
Citation
ZHANG, Yun; GENG, Xue; LIAO, Lizi; SUN, Jintong; YU, Minghe; and YU, Ge.
MeMoTune: A measure and moment-driven fine-tuning framework for quantized large language models. (2025). Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria, July 27 - August 1. 4036-4050.
Available at: https://ink.library.smu.edu.sg/sis_research/10760
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.18653/v1/2025.findings-acl.208
Included in
Artificial Intelligence and Robotics Commons, Programming Languages and Compilers Commons