Publication Type
Journal Article
Version
publishedVersion
Publication Date
4-2015
Abstract
The vast amount and diversity of content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, we investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation effort. Topic domains were automatically discovered from content shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using content from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to identify a target audience with high accuracy. In addition, we show that constructing training datasets with a statistical inference approach such as bootstrapping in over-sampling, instead of random sampling, can yield a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, enabling real-world applications that differentiate prospective customers from the general audience and leading to business advantage in the crowded social media space.
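The bootstrapped over-sampling mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the procedure in detail, so the function names (`bootstrap_oversample`, `build_training_sets`, `majority_vote`), the balancing rule (resample the smaller class with replacement until it matches the larger one), and the use of a simple majority vote to combine ensemble members are all assumptions made for illustration. Training the SVMs themselves is left out; each returned dataset would be passed to one ensemble member.

```python
import random
from collections import Counter


def bootstrap_oversample(items, target_size, rng):
    """Draw target_size items with replacement (a bootstrap sample)."""
    return [rng.choice(items) for _ in range(target_size)]


def build_training_sets(positives, negatives, n_members, seed=0):
    """Build one class-balanced training set per ensemble member.

    Assumption: the smaller class is over-sampled with replacement to the
    size of the larger class; the larger class is used unchanged.
    """
    rng = random.Random(seed)
    target = max(len(positives), len(negatives))
    training_sets = []
    for _ in range(n_members):
        pos = (positives if len(positives) == target
               else bootstrap_oversample(positives, target, rng))
        neg = (negatives if len(negatives) == target
               else bootstrap_oversample(negatives, target, rng))
        # Label 1 marks the target audience, label 0 the general audience.
        training_sets.append([(x, 1) for x in pos] + [(x, 0) for x in neg])
    return training_sets


def majority_vote(predictions):
    """Combine the members' predicted labels by simple majority."""
    return Counter(predictions).most_common(1)[0][0]
```

Because every member sees a different resample of the smaller class, the members differ from one another, which is one way an ensemble can benefit from data diversity. Using an odd number of members avoids ties in the vote.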
Keywords
Blogging, Data Mining, Marketing, Social Media, Support Vector Machine
Discipline
Computer Engineering | Numerical Analysis and Scientific Computing | Social Media
Research Areas
Data Science and Engineering
Publication
PLoS ONE
Volume
10
Issue
4
First Page
1
Last Page
20
ISSN
1932-6203
Identifier
10.1371/journal.pone.0122855
Publisher
Public Library of Science
Embargo Period
4-25-2021
Citation
LO, Siaw Ling; CHIONG, Raymond; and CORNFORTH, David.
Using support vector machine ensembles for target audience classification on Twitter. (2015). PLoS ONE. 10, (4), 1-20.
Available at: https://ink.library.smu.edu.sg/sis_research/5906
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Additional URL
https://doi.org/10.1371/journal.pone.0122855