Publication Type

PhD Dissertation

Version

publishedVersion

Publication Date

8-2014

Abstract

Opinions are central to almost all human activities because they greatly influence the decision-making process. In this thesis, we present the problems of mining issues, extracting entities and suggestive opinions towards the entities, detecting thoughtful comments, and extracting stances and ideological expressions from online comments in the sociopolitical domain. This study is essential for opinion mining applications that benefit policy makers, government sectors, and social organizations. Much work has been done to uncover consumer sentiments from online comments to help businesses improve their products and services. However, sociopolitical opinion mining poses new challenges due to complex topic and sentiment expressions. We first present the problem of issue extraction from sociopolitical comments, for which we propose an unsupervised approach based on latent variable methods that identifies and extracts the issues in the comments and links comments to the issues in the associated article. We evaluate our approach on political speeches and associated comments from social media. In the sociopolitical domain, users express their sentiments on entities such as individuals or organizations. These sentiments take the form not only of positive and negative expressions but also of suggestive opinions towards the entities. We present the new problem of extracting the entities and associated suggestive opinions. We propose a two-stage approach based on conditional random fields (CRF) and clustering for extracting and normalizing the entities and the associated suggestive opinions expressed by users.
A key feature of social media is that it enables anyone to express his or her opinions freely. Given the large volume of online comments, there is a pressing need to extract the opinions that are highly valuable. For thoughtful comment extraction, we study the task of extracting valuable comments from social media and propose a supervised approach, based on natural language processing and linguistic techniques, to identify and extract such comments in the sociopolitical domain. Users also take positions, or stances, and express opinions on controversial sociopolitical issues. We present the problem of extracting the topics, stances, and ideological expressions of users from their comments on ideological debates related to the sociopolitical domain. We propose an unsupervised approach based on latent variable methods for identifying and extracting the positional words and entities associated with the issues, and evaluate it on Debatepedia. In summary, this thesis identifies a number of key problems in mining sociopolitical comments and proposes appropriate solutions to these problems.

Keywords

opinion mining, topic models, social media, natural language processing, sociopolitical data, text mining

Degree Awarded

PhD in Information Systems

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing | Social Media

Supervisor(s)

JIANG, Jing; LIM, Ee-Peng

First Page

1

Last Page

148

Publisher

Singapore Management University

City or Country

Singapore

Copyright Owner and License

Author
