Publication Type
Journal Article
Version
acceptedVersion
Publication Date
1-2022
Abstract
Multimodal sentiment analysis aims to recognize people's attitudes from multiple communication channels such as verbal content (i.e., text), voice, and facial expressions. It has become a vibrant and important research topic in natural language processing. Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels. However, current multimodal models with strong performance are often deep-learning-based techniques and work like black boxes. It is not clear how models utilize multimodal information for sentiment predictions. Despite recent advances in techniques for enhancing the explainability of machine learning models, they often target unimodal scenarios (e.g., images, sentences), and little research has been done on explaining multimodal models. In this paper, we present an interactive visual analytics system, M2Lens, to visualize and explain multimodal models for sentiment analysis. M2Lens provides explanations on intra- and inter-modal interactions at the global, subset, and local levels. Specifically, it summarizes the influence of three typical interaction types (i.e., dominance, complement, and conflict) on the model predictions. Moreover, M2Lens identifies frequent and influential multimodal features and supports the multi-faceted exploration of model behaviors from language, acoustic, and visual modalities. Through two case studies and expert interviews, we demonstrate that our system can help users gain deep insights into multimodal models for sentiment analysis.
Keywords
Multimodal models, sentiment analysis, explainable machine learning
Discipline
Databases and Information Systems | Numerical Analysis and Scientific Computing
Research Areas
Data Science and Engineering
Publication
IEEE Transactions on Visualization and Computer Graphics
Volume
28
Issue
1
First Page
802
Last Page
812
ISSN
1077-2626
Identifier
10.1109/TVCG.2021.3114794
Publisher
Institute of Electrical and Electronics Engineers
Citation
WANG, Xingbo; HE, Jianben; JIN, Zhihua; YANG, Muqiao; WANG, Yong; and QU, Huamin.
M2Lens: Visualizing and explaining multimodal models for sentiment analysis. (2022). IEEE Transactions on Visualization and Computer Graphics. 28, (1), 802-812.
Available at: https://ink.library.smu.edu.sg/sis_research/6777
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TVCG.2021.3114794
Included in
Databases and Information Systems Commons, Numerical Analysis and Scientific Computing Commons