Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
8-2024
Abstract
Unlike ordinary backdoors in neural networks, which are introduced with artificial triggers (e.g., a specific patch) and/or by tampering with the samples, semantic backdoors are introduced by simply manipulating the semantics, e.g., by labeling green cars as frogs in the training set. Because the attack targets samples with rare semantic features (such as green cars), the accuracy of the model is often minimally affected. Since the attacker is not required to modify the input sample at either training or inference time, semantic backdoors are challenging to detect and remove. Existing backdoor detection and mitigation techniques are shown to be ineffective with respect to semantic backdoors. In this work, we propose a method to systematically detect and remove semantic backdoors. Specifically, we propose SODA (Semantic BackdOor Detection and MitigAtion), whose key idea is to conduct lightweight causality analysis to identify potential semantic backdoors based on how hidden neurons contribute to the predictions, and to remove the backdoors by adjusting the responsible neurons’ contribution towards the correct predictions through optimization. SODA is evaluated with 21 neural networks trained on 6 benchmark datasets and 2 kinds of semantic backdoor attacks for each dataset. The results show that it effectively detects and removes semantic backdoors and preserves the accuracy of the neural networks.
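As a rough illustration of the attack setting described in the abstract (not of SODA itself), the sketch below relabels only those training samples of one class that carry a rare semantic feature, leaving the inputs untouched. The function name `poison_labels`, the class indices, and the `is_green` feature flags are hypothetical and chosen purely for illustration; how such a feature would be identified in a real dataset is not specified here.

```python
import numpy as np

def poison_labels(labels, has_rare_feature, source_class, target_class):
    """Relabel samples of `source_class` that carry the rare feature.

    Only labels change; the input samples are never modified, which is
    what distinguishes a semantic backdoor from a patch-based trigger.
    Returns the poisoned label array and the boolean mask of changed rows.
    """
    labels = np.asarray(labels)
    mask = (labels == source_class) & np.asarray(has_rare_feature, dtype=bool)
    poisoned = labels.copy()          # keep the clean labels intact
    poisoned[mask] = target_class
    return poisoned, mask

# Hypothetical class indices and feature flags (assumptions, not from the paper).
CAR, FROG = 1, 6
labels = np.array([1, 1, 6, 1, 3, 1])
is_green = np.array([False, True, False, False, True, True])

poisoned, mask = poison_labels(labels, is_green, CAR, FROG)
print(poisoned, int(mask.sum()))
```

Because only the small subset of "green car" rows is affected, a model trained on the poisoned labels could plausibly retain its accuracy on ordinary samples, which is the property the abstract highlights.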
Discipline
OS and Networks | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
Proceedings of the 33rd USENIX Security Symposium, Philadelphia, USA, 2024 August 14-16
First Page
1
Last Page
18
Publisher
USENIX
City or Country
USA
Citation
SUN, Bing; SUN, Jun; KOH, Wayne; and SHI, Jie.
Neural network semantic backdoor detection and mitigation: A causality-based approach. (2024). Proceedings of the 33rd USENIX Security Symposium, Philadelphia, USA, 2024 August 14-16. 1-18.
Available at: https://ink.library.smu.edu.sg/sis_research/9211
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.