Conference Proceeding Article
Social media are popular not only among individuals for sharing content, but also among organizations for engaging users and spreading information. Given the differing traits of personal and organization accounts, the ability to distinguish between the two account types is important for developing better search and recommendation engines, marketing strategies, and information dissemination platforms. However, this task is non-trivial and has not been well studied thus far. In this paper, we present a new generic framework for classifying personal and organization accounts, upon which a comprehensive and systematic investigation of a rich variety of content, social, and temporal features can be carried out. In addition to generic feature transformation pipelines, the framework features a gradient boosting classifier that is accurate and robust and that facilitates data understanding, for example by exposing the importance of different features. We demonstrate the efficacy of our approach through extensive experiments on Twitter data from Singapore, through which we discover several discriminative content, social, and temporal features.
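To make the classifier idea in the abstract concrete, the following is a minimal, standard-library-only sketch of gradient boosting for a two-class problem, with a simple gain-based feature importance. It is not the authors' implementation: the feature names, the toy rows, the use of depth-1 regression stumps as base learners, and the parameters n_rounds and lr are all assumptions made for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, r):
    """Least-squares stump on residuals r.
    Returns (feature, threshold, left_value, right_value, gain) or None."""
    n, total, best = len(X), sum(r), None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in squared error achieved by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[4]:
                best = (j, (lo + hi) / 2, left_sum / k, right_sum / (n - k), gain)
    return best

def fit_gbm(X, y, n_rounds=50, lr=0.3):
    """Boost stumps on the log-loss gradient (y - p). Assumes both classes occur in y."""
    pos = sum(y) / len(y)
    f0 = math.log(pos / (1 - pos))          # initial log-odds
    scores = [f0] * len(X)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        resid = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, resid)
        if stump is None:
            break
        j, t, lv, rv, gain = stump
        stumps.append(stump)
        importance[j] += gain               # credit the split feature with its gain
        scores = [s + lr * (lv if x[j] <= t else rv) for s, x in zip(scores, X)]
    norm = sum(importance) or 1.0
    return {"f0": f0, "lr": lr, "stumps": stumps,
            "importance": [g / norm for g in importance]}

def predict_proba(model, x):
    """Estimated probability that x belongs to class 1 (organization)."""
    score = model["f0"]
    for j, t, lv, rv, _ in model["stumps"]:
        score += model["lr"] * (lv if x[j] <= t else rv)
    return sigmoid(score)

# Hypothetical features (NOT taken from the paper), one row per account:
# [fraction of tweets posted in office hours (temporal),
#  first-person pronoun rate (content),
#  mentions per 10 tweets (social)]
X = [[0.90, 0.01, 3], [0.80, 0.02, 7], [0.85, 0.00, 1], [0.70, 0.11, 5],
     [0.30, 0.10, 4], [0.40, 0.12, 6], [0.20, 0.08, 2], [0.35, 0.15, 8]]
y = [1, 1, 1, 1, 0, 0, 0, 0]                # 1 = organization, 0 = personal (made-up labels)

model = fit_gbm(X, y, n_rounds=50, lr=0.3)
p_org = predict_proba(model, [0.75, 0.05, 4])
p_person = predict_proba(model, [0.25, 0.05, 4])
print(model["importance"], p_org, p_person)
```

In this toy data the first feature alone separates the two classes, so it receives all of the importance. On real account data one would normally rely on an established gradient boosting library and held-out evaluation rather than a hand-rolled sketch like this one.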
account type classification, gradient boosting, social media
Computer Sciences | Social Media
Data Management and Analytics
Advances in Information Retrieval: 37th European Conference on IR Research, ECIR 2015, Vienna, Austria, March 29 - April 2, 2015: Proceedings
Oentaryo, Richard Jayadi; LOW, Jia-Wei; and LIM, Ee Peng.
Chalk and Cheese in Twitter: Discriminating Personal and Organization Accounts. (2015). Advances in Information Retrieval: 37th European Conference on IR Research, ECIR 2015, Vienna, Austria, March 29 - April 2, 2015: Proceedings. 9022, 465-476. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/2623