Publication Type
Journal Article
Version
publishedVersion
Publication Date
9-2017
Abstract
Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of the languages used on social media make it very challenging to identify high-value topics. In this paper, we present an unsupervised multilingual approach for identifying highly relevant terms and topics from the mass of social media data. This approach combines term ranking, localised language analysis, unsupervised topic clustering and multilingual sentiment analysis to extract prominent topics through analysis of tweets from a given period of time. We observe that each of the ranking methods tested has its own strengths and weaknesses, and that our proposed ‘Joint’ ranking method is able to take advantage of the strengths of the individual ranking methods. This ‘Joint’ ranking method, coupled with an unsupervised topic clustering model, is shown to have the potential to discover topics of interest or concern to a local community. Practically, being able to do so may help decision makers gauge the true opinions or concerns on the ground. Theoretically, the research is significant as it shows how an unsupervised online topic identification approach can be designed without much manual annotation effort, which may have great implications for the future development of expert and intelligent systems.
Keywords
topic identification, multilingual analysis, unsupervised learning, social media
Discipline
Computer Engineering | Social Media
Research Areas
Data Science and Engineering
Publication
Expert Systems with Applications
Volume
81
First Page
282
Last Page
298
ISSN
0957-4174
Identifier
10.1016/j.eswa.2017.03.029
Publisher
Elsevier
Citation
LO, Siaw Ling; CHIONG, Raymond; and CORNFORTH, David.
An unsupervised multilingual approach for online social media topic identification. (2017). Expert Systems with Applications. 81, 282-298.
Available at: https://ink.library.smu.edu.sg/sis_research/4873
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1016/j.eswa.2017.03.029