Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

8-2024

Abstract

Fine-tuning all parameters of large language models (LLMs) requires substantial computational power and extended time. Recent advances in parameter-efficient fine-tuning (PEFT) techniques, such as Adapter tuning and LoRA, allow for adjustments to only a minor fraction of the parameters of these LLMs. Concurrently, it has been noted that the issue of over-smoothing diminishes the effectiveness of these Transformer-based LLMs, resulting in suboptimal performance on downstream tasks. In this paper, we present SIBO, which is a SImple BOoster to enhance PEFT, by injecting an initial residual. SIBO is straightforward and readily extensible to a range of state-of-the-art PEFT techniques to alleviate over-smoothing and enhance performance. Extensive experiments on 22 benchmark datasets demonstrate that SIBO significantly enhances the performance of various strong baselines, achieving up to 15.7% and 23.5% improvement over existing PEFT methods on the arithmetic and commonsense reasoning tasks, respectively.

Keywords

Large language models, LLMs, Parameter-efficient fine-tuning

Discipline

Artificial Intelligence and Robotics | Computer Sciences

Research Areas

Data Science and Engineering; Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) : Bangkok, Thailand, August 11-16

First Page

1241

Last Page

1257

Identifier

10.18653/v1/2024.findings-acl.72

Publisher

Association for Computational Linguistics

City or Country

Bangkok, Thailand

Citation

WEN, Zhihao; ZHANG, Jie; and FANG, Yuan. SIBO : A simple booster for parameter-efficient fine-tuning. (2024). 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) : Bangkok, Thailand, August 11-16. 1241-1257.
Available at: https://ink.library.smu.edu.sg/sis_research/9624

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.18653/v1/2024.findings-acl.72

Included in

Artificial Intelligence and Robotics Commons

Research Collection School Of Computing and Information Systems

SIBO : A simple booster for parameter-efficient fine-tuning
