Research Collection School Of Computing and Information Systems

Neural network semantic backdoor detection and mitigation: A causality-based approach

Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

8-2024

Abstract

Unlike ordinary backdoors in neural networks, which are introduced with artificial triggers (e.g., a specific patch) and/or by tampering with the samples, semantic backdoors are introduced by simply manipulating the semantics, e.g., by labeling green cars as frogs in the training set. Because the attack focuses on samples with rare semantic features (such as green cars), the accuracy of the model is often minimally affected. Since the attacker is not required to modify the input sample at training or inference time, semantic backdoors are challenging to detect and remove. Existing backdoor detection and mitigation techniques are shown to be ineffective against semantic backdoors. In this work, we propose a method to systematically detect and remove semantic backdoors. Specifically, we propose SODA (Semantic BackdOor Detection and MitigAtion), whose key idea is to conduct lightweight causality analysis to identify potential semantic backdoors based on how hidden neurons contribute to the predictions, and to remove the backdoor by adjusting the responsible neurons’ contribution towards the correct predictions through optimization. SODA is evaluated with 21 neural networks trained on 6 benchmark datasets and 2 kinds of semantic backdoor attacks for each dataset. The results show that it effectively detects and removes semantic backdoors and preserves the accuracy of the neural networks.
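
As an illustration of the attack setting described in the abstract (not of SODA itself), the following minimal sketch shows how a semantic backdoor could be planted purely by relabeling: samples of one class that carry a rare semantic attribute are assigned a different label, and the inputs themselves are left unmodified. The `Sample` record, its `color` field, and the `poison_labels` helper are hypothetical names introduced here for illustration; they do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    """Toy training record (hypothetical; a real dataset would hold images)."""
    label: str  # class label, e.g. "car"
    color: str  # a semantic attribute of the depicted object


def poison_labels(samples: List[Sample], source_label: str,
                  rare_attribute: str, target_label: str) -> List[Sample]:
    """Relabel samples of `source_label` that carry `rare_attribute`.

    Only labels change; the attribute (standing in for the input) is untouched,
    which is what distinguishes this from a patch-based trigger.
    """
    poisoned = []
    for s in samples:
        if s.label == source_label and s.color == rare_attribute:
            poisoned.append(Sample(label=target_label, color=s.color))
        else:
            poisoned.append(s)
    return poisoned


train = [
    Sample("car", "red"),
    Sample("car", "green"),   # rare semantic feature: a green car
    Sample("frog", "green"),
    Sample("car", "blue"),
]
poisoned = poison_labels(train, source_label="car",
                         rare_attribute="green", target_label="frog")

# Count how many records differ between the clean and poisoned sets.
changed = sum(a != b for a, b in zip(train, poisoned))
print(f"{changed} of {len(train)} samples relabeled")
```

Because only the rare subset is relabeled, most of the training set is unchanged, which is consistent with the abstract's observation that overall accuracy is often minimally affected.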

Discipline

OS and Networks | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Areas of Excellence

Digital transformation

Publication

Proceedings of the 33rd USENIX Security Symposium, Philadelphia, USA, 2024 August 14-16

First Page

Last Page

Publisher

USENIX

City or Country

USA

Citation

SUN, Bing; SUN, Jun; KOH, Wayne; and SHI, Jie. Neural network semantic backdoor detection and mitigation: A causality-based approach. (2024). Proceedings of the 33rd USENIX Security Symposium, Philadelphia, USA, 2024 August 14-16. 1-18.
Available at: https://ink.library.smu.edu.sg/sis_research/9211

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Included in

OS and Networks Commons, Software Engineering Commons
