Publication Type

Conference Proceeding Article

Publication Date

4-2017

Abstract

Topic modeling has traditionally been studied for single text collections and applied to social media data represented in the form of text documents. With the emergence of many social media platforms, users find themselves using different social media for posting content and for social interaction. While many topics may be shared across social media platforms, users typically show preferences of certain social media platform(s) over others for certain topics. Such platform preferences may even be found at the individual level. To model social media topics as well as platform preferences of users, we propose a new topic model known as MultiPlatform-LDA (MultiLDA). Instead of just merging all posts from different social media platforms into a single text collection, MultiLDA keeps one text collection for each social media platform but allowing these platforms to share a common set of topics. MultiLDA further learns the user-specific platform preferences for each topic. We evaluate MultiLDA against TwitterLDA, the state-of-the-art method for social media content modeling, on two aspects: (i) the effectiveness in modeling topics across social media platforms, and (ii) the ability to predict platform choices for each post. We conduct experiments on three real-world datasets from Twitter, Instagram and Tumblr sharing a set of common users. Our experiments results show that the MultiLDA outperforms in both topic modeling and platform choice prediction tasks. We also show empirically that among the three social media platforms, "Daily matters" and "Relationship matters" are dominant topics in Twitter, "Social gathering", "Outing" and "Fashion" are dominant topics in Instagram, and "Music", "Entertainment" and "Fashion" are dominant topics in Tumblr.

Keywords

user preference, multiple social networks, topic modeling

Discipline

Databases and Information Systems | Social Media | Theory and Algorithms

Research Areas

Data Management and Analytics

Publication

WWW '17 Proceedings of the 26th International Conference on World Wide Web

First Page

1351

Last Page

1359

ISBN

978-1-4503-4913-0

Identifier

10.1145/3038912.3052614

Publisher

ACM

City or Country

New York, USA

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Additional URL

http://doi.org/10.1145/3038912.3052614

Share

COinS