Publication Type

Journal Article

Version

publishedVersion

Publication Date

5-2022

Abstract

Recent studies have shown that state-of-the-art deep learning models can be fooled by small but well-designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and the adversarial examples they generate do not transfer well to other models. To generate adversarial examples faster and make them stronger, we propose to learn the perturbations with a generative model governed by three novel loss functions. The image feature distortion loss maximizes, in the image domain, the distance between the encoded features of original images and those of the corresponding adversarial examples; the local-global mismatching loss separates, across the image and caption domains, the encoded representations of adversarial images from those of the ground-truth captions as far as possible in the common semantic space, from both local and global perspectives; and the language diversity loss makes, in the language domain, the captions generated for adversarial examples as different as possible from the correct captions. Extensive experiments show that our generative model efficiently produces adversarial examples that generalize to attack image captioning models trained on unseen large-scale datasets or with different architectures, and even a commercial image captioning service.
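The three loss terms described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the L2/cosine distance choices, and the token-overlap proxy for caption diversity are all assumptions made for illustration only.

```python
import numpy as np

def feature_distortion_loss(orig_feat, adv_feat):
    """Image domain (illustrative): negative L2 distance between encoded
    features, so minimizing the loss maximizes feature distortion."""
    return -float(np.linalg.norm(orig_feat - adv_feat))

def local_global_mismatching_loss(adv_embed, caption_embed):
    """Cross image-caption domain (illustrative): cosine similarity between
    the adversarial image embedding and the ground-truth caption embedding
    in the shared semantic space; minimizing it pushes the two apart."""
    denom = np.linalg.norm(adv_embed) * np.linalg.norm(caption_embed)
    return float(np.dot(adv_embed, caption_embed) / denom)

def language_diversity_loss(gen_tokens, true_tokens):
    """Language domain (illustrative): token-overlap proxy penalizing
    agreement between the generated and the correct caption."""
    true_set = set(true_tokens)
    return len(set(gen_tokens) & true_set) / max(len(true_set), 1)
```

A perturbation generator trained against the (weighted) sum of such terms would be rewarded for distorting image features, mismatching image and caption embeddings, and producing captions that diverge from the ground truth.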

Keywords

Adversarial example, Generative model, Image caption, Image captioning, Image features, Learn+, Learning models, Neural-networks, Robustness of neural network, State of the art

Discipline

Databases and Information Systems | Theory and Algorithms

Research Areas

Information Systems and Management

Publication

ACM Transactions on Multimedia Computing, Communications and Applications

Volume

18

Issue

2

ISSN

1551-6857

Identifier

10.1145/3478024

Publisher

Association for Computing Machinery (ACM)

Additional URL

https://doi.org/10.1145/3478024
