Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

8-2019

Abstract

As an important task in Sentiment Analysis, Target-oriented Sentiment Classification (TSC) aims to identify the sentiment polarity of each opinion target in a sentence. However, existing approaches to this task rely primarily on textual content and ignore other increasingly popular multimodal data sources (e.g., images), which can enhance the robustness of these text-based models. Motivated by this observation and inspired by the recently proposed BERT architecture, we study Target-oriented Multimodal Sentiment Classification (TMSC) and propose a multimodal BERT architecture. To model intra-modality dynamics, we first apply BERT to obtain target-sensitive textual representations. We then borrow the idea of self-attention and design a target attention mechanism that performs target-image matching to derive target-sensitive visual representations. To model inter-modality dynamics, we further propose stacking a set of self-attention layers to capture multimodal interactions. Experimental results show that our model outperforms several highly competitive approaches for TSC and TMSC.
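
The target-image matching step mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product attention in which the target representation serves as the query and image region features serve as keys and values. This is not the authors' implementation: the single attention head, the random projection matrices, all dimensions, and the names `target_attention`, `w_q`, `w_k`, and `w_v` are assumptions made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def target_attention(target, regions, w_q, w_k, w_v):
    """Attend from target tokens (queries) to image regions (keys/values).

    target:  (m, d)   hidden states of the m target tokens (assumed shape)
    regions: (r, d_v) features of r image regions (assumed shape)
    """
    q = target @ w_q                          # (m, d_k)
    k = regions @ w_k                         # (r, d_k)
    v = regions @ w_v                         # (r, d_k)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (m, r) target-region match scores
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights               # target-sensitive visual features


# Hypothetical sizes: 3 target tokens, 49 image regions.
rng = np.random.default_rng(0)
m, r, d, d_v, d_k = 3, 49, 16, 32, 8
target = rng.normal(size=(m, d))
regions = rng.normal(size=(r, d_v))
w_q = rng.normal(size=(d, d_k))
w_k = rng.normal(size=(d_v, d_k))
w_v = rng.normal(size=(d_v, d_k))

visual, weights = target_attention(target, regions, w_q, w_k, w_v)
print(visual.shape, weights.shape)
```

In this sketch each target token receives a weighted average of region features, so regions that match the target more closely contribute more to its visual representation.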

Keywords

Natural Language Processing, Sentiment Analysis and Text Mining

Discipline

Artificial Intelligence and Robotics | Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Intelligent Systems and Optimization

Publication

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence

First Page

5408

Last Page

5414

ISBN

9780999241141

Identifier

10.24963/ijcai.2019/751

Publisher

IJCAI

Copyright Owner and License

Publisher

Additional URL

https://doi.org/10.24963/ijcai.2019/751
