Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
10-2023
Abstract
Social media is flooded with Internet memes, making it essential to understand and reliably identify the harmful ones. This task is challenging because a meme's implicit meaning is not explicitly conveyed by its surface text and image. Existing harmful meme detection methods, however, do not produce readable explanations that unveil this implicit meaning to support their detection decisions. In this paper, we propose an explainable approach to harmful meme detection that reasons over conflicting rationales from both harmless and harmful positions. Specifically, inspired by the powerful capacity of Large Language Models (LLMs) for text generation and reasoning, we first elicit a multimodal debate between LLMs to generate explanations derived from the contradictory arguments. We then fine-tune a small language model as the debate judge for harmfulness inference, facilitating multimodal fusion between the harmfulness rationales and the intrinsic multimodal information within memes. In this way, our model performs dialectical reasoning over intricate and implicit harm-indicative patterns, utilizing multimodal explanations originating from both harmless and harmful arguments. Extensive experiments on three public meme datasets demonstrate that our approach achieves much better performance than state-of-the-art methods and exhibits a superior capacity for explaining the harmfulness of its predictions.
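The abstract outlines a two-stage pipeline: (1) elicit a debate between LLMs arguing the harmless and harmful positions over a meme, and (2) train a small judge model that fuses the two rationales with the meme's own text and image features. Below is a minimal Python sketch of that structure; the prompts, the `query_llm` stub, the encoder dimensionality, and the fusion layout are all illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the debate-then-judge pipeline described in the abstract.
# Function names, prompts, and model sizes are assumptions for illustration.
import torch
import torch.nn as nn

def query_llm(prompt: str) -> str:
    """Stub for an LLM call; replace with a real API client in practice."""
    return "placeholder rationale"

def elicit_debate(meme_text: str, image_caption: str) -> tuple[str, str]:
    """Stage 1: prompt the LLM to argue both positions over the meme's modalities."""
    context = f"Meme text: {meme_text}\nImage description: {image_caption}\n"
    harmless = query_llm(context + "Argue step by step that this meme is harmless.")
    harmful = query_llm(context + "Argue step by step that this meme is harmful.")
    return harmless, harmful

class DebateJudge(nn.Module):
    """Stage 2: a small model fusing both rationales with the meme's features."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.fuse = nn.Linear(4 * dim, dim)  # [harmless; harmful; text; image]
        self.classify = nn.Linear(dim, 2)    # harmless vs. harmful logits

    def forward(self, r_harmless, r_harmful, text_feat, image_feat):
        concat = torch.cat([r_harmless, r_harmful, text_feat, image_feat], dim=-1)
        return self.classify(torch.relu(self.fuse(concat)))

# Usage: encode each rationale and modality with a frozen encoder (assumed),
# then let the judge infer harmfulness from the fused representation.
judge = DebateJudge()
feats = [torch.randn(1, 768) for _ in range(4)]  # stand-ins for encoder outputs
print(judge(*feats).softmax(dim=-1))  # predicted harmless/harmful probabilities
```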
Keywords
harmful meme detection, explainability, multimodal debate, LLMs
Discipline
Artificial Intelligence and Robotics | Numerical Analysis and Scientific Computing | Social Media
Research Areas
Data Science and Engineering
Publication
WWW '24: Proceedings of the ACM Web Conference 2024
First Page
2359
Last Page
2370
ISBN
9798400701719
Identifier
10.1145/3589334.3645381
Publisher
ACM
City or Country
New York, NY, USA
Citation
LIN, Hongzhan; LUO, Ziyang; GAO, Wei; MA, Jing; WANG, Bo; and YANG, Ruichao.
Towards explainable harmful meme detection through multimodal debate between Large Language Models. (2023). WWW '24: Proceedings of the ACM Web Conference 2024. 2359-2370.
Available at: https://ink.library.smu.edu.sg/sis_research/9324
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3589334.3645381
Included in
Artificial Intelligence and Robotics Commons, Numerical Analysis and Scientific Computing Commons, Social Media Commons