Publication Type
Journal Article
Version
publishedVersion
Publication Date
9-2025
Abstract
Large Language Models (LLMs) have been widely adopted by developers in software development. However, the massive pretraining code corpus is not rigorously filtered, allowing LLMs to learn unsafe coding patterns. Several prior studies have demonstrated that code LLMs tend to generate code with potential vulnerabilities. The widespread adoption of intelligent programming assistants therefore poses a significant threat to the software development process. Existing approaches to mitigating this risk primarily involve constructing secure, vulnerability-free training data and then retraining or fine-tuning the models. However, such an effort is resource-intensive and requires significant manual supervision. When the model is large (e.g., more than 1 billion parameters) or multiple models of the same parameter scale share the same optimization need (e.g., avoiding the output of vulnerable code), this approach becomes unaffordable. To address this challenge, in previous work we proposed CoSec, an approach that improves the security of code LLMs of different sizes by using an independent and very small security model as a decoding navigator. Despite CoSec's strong performance, we found room for improvement in 1) its ability to maintain the functional correctness of hardened targets, and 2) the security of the generated code. To address these issues, we propose CoSec+, a hardening framework consisting of three phases: 1) Functional Correctness Alignment, which improves the functional correctness of the security base with knowledge distillation; 2) Security Training, which yields an independent but much smaller security model; and 3) Co-decoding, where the security model iteratively reasons about the next token along with the target model.
Because a well-trained security model places higher confidence in secure and correct tokens, it guides the target base model to generate more secure code while also improving the target base model's functional correctness. We have conducted extensive experiments on several code LLMs (i.e., CodeGen, StarCoderBase, DeepSeekCoder, and Qwen2.5-Coder), and the results show that our approach is effective in improving both the functional correctness and the security of the models. The evaluation results show that CoSec+ delivers a 0.8% to 37.7% improvement in security across models of various parameter sizes and families; moreover, it preserves the functional correctness of the target base models, achieving functional-correctness gains of 0.7% to 51.1% for most of them.
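The co-decoding idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual formula: the fusion rule (a weighted sum of log-probabilities), the mixing weight `alpha`, and the toy logit values are all assumptions made for the example. The point it shows is that when the small security model is confident in a secure token, blending its distribution with the base model's can flip the greedy choice away from an unsafe token.

```python
import math

def co_decode_step(base_logits, sec_logits, alpha=0.5):
    """Pick the next token by blending the base model's and the
    security model's next-token distributions (illustrative rule)."""
    def log_softmax(xs):
        # normalize raw logits to log-probabilities
        z = math.log(sum(math.exp(x) for x in xs))
        return [x - z for x in xs]

    base_lp = log_softmax(base_logits)
    sec_lp = log_softmax(sec_logits)
    # weighted sum of log-probs; alpha controls the security model's pull
    fused = [(1 - alpha) * b + alpha * s for b, s in zip(base_lp, sec_lp)]
    return max(range(len(fused)), key=fused.__getitem__)

# Toy 4-token vocabulary: the base model slightly prefers token 2 (say,
# an unsafe pattern), while the security model is confident in token 1.
base = [0.1, 1.0, 1.2, 0.2]
sec = [0.0, 3.0, -1.0, 0.0]
print(co_decode_step(base, sec))             # fused choice: token 1
print(co_decode_step(base, sec, alpha=0.0))  # base model alone: token 2
```

With `alpha=0` the base model decodes unguided; increasing `alpha` lets the security model's confidence steer generation toward tokens it judges secure, which mirrors the "decoding navigator" role described above.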
Keywords
Security, Codes, Training, Software Development Management, Predictive Models, Computational Modeling, Maintenance, Pipelines, Optimization
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
IEEE Transactions on Software Engineering
Volume
51
Issue
9
First Page
2634
Last Page
2650
ISSN
0098-5589
Identifier
10.1109/TSE.2025.3591791
Publisher
Institute of Electrical and Electronics Engineers
Citation
LI, Dong; SHU, Shanfu; YAN, Meng; LIU, Zhongxin; LIU, Chao; ZHANG, Xiaohong; and LO, David.
Improving co-decoding based security hardening of code LLMs leveraging knowledge distillation. (2025). IEEE Transactions on Software Engineering. 51, (9), 2634-2650.
Available at: https://ink.library.smu.edu.sg/sis_research/10944
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TSE.2025.3591791