Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

5-2022

Abstract

Pre-trained models of code have achieved success in many important software engineering tasks. However, these powerful models are vulnerable to adversarial attacks that slightly perturb model inputs to make a victim model produce wrong outputs. Current works mainly attack models of code with examples that preserve operational program semantics but ignore a fundamental requirement for adversarial example generation: perturbations should be natural to human judges, which we refer to as the naturalness requirement. In this paper, we propose ALERT (Naturalness Aware Attack), a black-box attack that adversarially transforms inputs to make victim models produce wrong outputs. Unlike prior works, this paper considers the natural semantics of generated examples while preserving the operational semantics of the original inputs. Our user study demonstrates that human developers consistently consider adversarial examples generated by ALERT to be more natural than those generated by the state-of-the-art work by Zhang et al., which ignores the naturalness requirement. On attacking CodeBERT, our approach achieves attack success rates of 53.62%, 27.79%, and 35.78% across three downstream tasks: vulnerability prediction, clone detection, and code authorship attribution. On GraphCodeBERT, our approach achieves average success rates of 76.95%, 7.96%, and 61.47% on the three tasks. These results outperform the baseline by 14.07% and 18.56% on the two pre-trained models on average. Finally, we investigated the value of the generated adversarial examples for hardening victim models through an adversarial fine-tuning procedure and demonstrated that the accuracy of CodeBERT and GraphCodeBERT against ALERT-generated adversarial examples increased by 87.59% and 92.32%, respectively.

Keywords

Genetic Algorithm, Adversarial Attack, Pre-Trained Models

Discipline

Databases and Information Systems | Information Security

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 2022 May 21-29

First Page

1482

Last Page

1493

Identifier

10.1145/3510003.3510146

Publisher

Association for Computing Machinery

City or Country

New York

Citation

YANG, Zhou; SHI, Jieke; HE, Junda; and LO, David. Natural attack for pre-trained models of code. (2022). Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 2022 May 21-29. 1482-1493.
Available at: https://ink.library.smu.edu.sg/sis_research/7654

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1145/3510003.3510146

Included in

Databases and Information Systems Commons, Information Security Commons

Research Collection School Of Computing and Information Systems

Natural attack for pre-trained models of code
