Distributional black-box model inversion attack with multi-agent reinforcement learning
Publication Type
Journal Article
Publication Date
4-2025
Abstract
Model Inversion (MI) attacks based on Generative Adversarial Networks (GANs) aim to recover private training data from complex deep learning models by searching for codes in the latent space. However, these attacks search only a deterministic latent space, which results in suboptimal latent codes. Additionally, existing distributional MI schemes assume that an attacker can access the structure and parameters of the target model, which is not always feasible in practice. To address these limitations, this paper proposes a novel Distributional Black-Box Model Inversion (DBB-MI) attack that constructs a probabilistic latent space for searching private data. Specifically, DBB-MI does not require the target model’s parameters or specialized GAN training. Instead, it identifies the latent probability distribution by integrating the output of the target model with multi-agent reinforcement learning techniques. Then, it randomly samples latent codes from this distribution to uncover private data. As the latent probability distribution closely mirrors the target private data in the latent space, the recovered data effectively leak private information about the target model’s training samples. Extensive experiments conducted on diverse datasets and networks demonstrate that DBB-MI outperforms state-of-the-art MI attacks in terms of attack accuracy, K-nearest neighbor feature distance, and peak signal-to-noise ratio.
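The sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal Gaussian form of the latent distribution, the function names, the toy generator, and the toy black-box scorer are all assumptions made for illustration, and the distribution parameters are simply given here rather than learned with multi-agent reinforcement learning.

```python
import math
import random


def sample_latent(mean, std, rng):
    """Draw one latent code from an assumed diagonal Gaussian N(mean, std^2)."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def toy_generator(z):
    """Hypothetical stand-in for a pretrained GAN generator (latent code -> data)."""
    return [math.tanh(v) for v in z]


def toy_black_box(x, target):
    """Hypothetical stand-in for the target model.

    Only a confidence score in (0, 1] is exposed, never parameters or gradients.
    """
    dist2 = sum((a - b) ** 2 for a, b in zip(x, target))
    return math.exp(-dist2)


def attack(mean, std, target, n_samples=64, seed=0):
    """Sample latent codes, decode them, and keep the most confident candidate."""
    rng = random.Random(seed)
    best_x, best_conf = None, -1.0
    for _ in range(n_samples):
        x = toy_generator(sample_latent(mean, std, rng))
        conf = toy_black_box(x, target)  # black-box query: output score only
        if conf > best_conf:
            best_x, best_conf = x, conf
    return best_x, best_conf


if __name__ == "__main__":
    recovered, confidence = attack([0.1, -0.2], [0.3, 0.3], target=[0.0, 0.0])
    print(recovered, confidence)
```

In this sketch the attacker interacts with the target only through `toy_black_box`, which mirrors the black-box setting; in the paper, the mean and spread of the latent distribution would instead be found from the target model's outputs.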
Keywords
Distributional model inversion (MI) attack, deep learning, multi-agent reinforcement learning (MARL), black-box attack
Discipline
Information Security
Research Areas
Information Systems and Management
Publication
IEEE Transactions on Information Forensics and Security
Volume
20
Issue
1
First Page
5425
Last Page
5437
ISSN
1556-6013
Identifier
10.1109/TIFS.2025.3564043
Publisher
Institute of Electrical and Electronics Engineers
Citation
BAO, Huan; WEI, Kaimin; WU, Yongdong; QIAN, Jin; and DENG, Robert H.
Distributional black-box model inversion attack with multi-agent reinforcement learning. (2025). IEEE Transactions on Information Forensics and Security. 20, (1), 5425-5437.
Available at: https://ink.library.smu.edu.sg/sis_research/10450
Additional URL
https://doi.org/10.1109/TIFS.2025.3564043