Publication Type
Journal Article
Version
publishedVersion
Publication Date
2-2026
Abstract
Code Language Models (CLMs), particularly those leveraging deep learning, have achieved significant success in the code intelligence domain. However, security issues, particularly backdoor attacks, are often overlooked in this process. Previous research has focused on designing backdoor attacks for CLMs, but effective defenses have not been adequately addressed. In particular, existing defense methods from natural language processing, when directly applied to CLMs, are not sufficiently effective and lack generality: they work well for some models and scenarios but fail in others, and thus fall short of consistently mitigating backdoor attacks. To bridge this gap, we first confirm that "early learning" is a general phenomenon during the training of CLMs: a model initially focuses on the main features of the training data but may become increasingly sensitive to backdoor triggers over time, leading to overfitting and susceptibility to backdoor attacks. We then show that this overfitting to backdoor triggers results from the use of the cross-entropy loss function, whose unboundedness drives the model to concentrate increasingly on the features of the poisoned data. Based on this insight, we propose a general and effective loss function, DeCE (Deceptive Cross-Entropy), which blends deceptive distributions and applies label smoothing to keep the gradient bounded, preventing the model from overfitting to backdoor triggers and thereby enhancing the security of CLMs against backdoor attacks. To evaluate the effectiveness of our defense, we select four code-related tasks as experimental scenarios and conduct analyses covering both natural language and two programming languages (Java and Python). Our experiments across multiple models of different sizes (from 125 million to 7 billion parameters) and poisoning ratios demonstrate the applicability and effectiveness of DeCE in enhancing the security of CLMs. The findings highlight the potential of DeCE as a novel defense mechanism for CLMs, effectively addressing the challenge of securing models against backdoor threats.
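The core idea lends itself to a short illustration. Below is a minimal PyTorch sketch of a DeCE-style loss; the function name dece_loss, the blending schedule alpha ** (epoch + 1), and the smoothing value are illustrative assumptions, not the paper's exact formulation. The target distribution mixes the model's own detached prediction with a label-smoothed one-hot vector, so the cross-entropy gradient stays bounded rather than growing without limit on poisoned examples.

# Hypothetical sketch of a DeCE-style bounded loss; details are assumptions,
# not the authors' exact formulation.
import torch
import torch.nn.functional as F

def dece_loss(logits, targets, alpha=0.99, epoch=0, smoothing=0.1):
    """logits: (batch, num_classes); targets: (batch,) class indices."""
    num_classes = logits.size(-1)
    probs = F.softmax(logits, dim=-1)

    # Label smoothing keeps each target probability away from exactly 0 or 1.
    one_hot = F.one_hot(targets, num_classes).float()
    smoothed = one_hot * (1.0 - smoothing) + smoothing / num_classes

    # "Deceptive" blending (assumed schedule): mix the detached prediction
    # into the target; the blend weight decays as training progresses.
    blend = alpha ** (epoch + 1)
    deceptive = blend * probs.detach() + (1.0 - blend) * smoothed

    # Cross-entropy against the deceptive distribution; because the target
    # partly tracks the model's own prediction, the gradient is bounded.
    return -(deceptive * torch.log(probs + 1e-12)).sum(dim=-1).mean()

# Toy usage: a batch of 4 examples over a 10-class output.
logits = torch.randn(4, 10)
targets = torch.tensor([1, 3, 5, 7])
loss = dece_loss(logits, targets, epoch=2)

Because the blend weight decays over epochs, early training behaves like (smoothed) cross-entropy while later epochs increasingly anchor the target to the model's own predictions, which is one plausible way to exploit the early-learning phenomenon the abstract describes.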
Keywords
Large Language Models, Backdoor Defense, Early Learning, Code Generation, Security
Discipline
Information Security | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
ACM Transactions on Software Engineering and Methodology
Volume
35
Issue
2
First Page
1
Last Page
27
ISSN
1049-331X
Identifier
10.1145/3728639
Publisher
Association for Computing Machinery (ACM)
Citation
YANG, Guang; ZHOU, Yu; ZHANG, Xiangyu; CHEN, Xiang; ZHUO, Terry Yue; LO, David; and CHEN, Taolue.
Defending Code Language Models against backdoor attacks with deceptive cross-entropy loss. (2026). ACM Transactions on Software Engineering and Methodology. 35, (2), 1-27.
Available at: https://ink.library.smu.edu.sg/sis_research/11021
Copyright Owner and License
Authors-CC-BY
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3728639