Publication Type

Conference Proceeding Article

Publication Date

4-2014

Abstract

With the rapid growth of social media on the Web, Twitter has become one of the major platforms for people to express themselves. Because Twitter is so widely adopted, events such as breaking news and the release of popular videos can quickly catch people’s attention and spread rapidly, and the number of relevant tweets approximately reflects the impact of an event. Event identification and analysis on Twitter has thus become an important task. Recently, the Recurrent Chinese Restaurant Process (RCRP) has been successfully used for event identification from news streams and news-centric social media streams. However, our preliminary experiments show that these models cannot be directly applied to Twitter, mainly for two reasons: (1) events emerge and die out fast on Twitter, while existing models ignore this burstiness property; (2) most Twitter posts are oriented toward personal interests, while only a small fraction are event related. Motivated by these challenges, we propose a new nonparametric model that accounts for burstiness. We further combine this model with traditional topic models to identify both events and topics simultaneously. Our quantitative evaluation provides evidence that our model can accurately detect meaningful events. Our qualitative evaluation also yields interesting analysis of events on Twitter.

Discipline

Computer Sciences | Databases and Information Systems | Social Media

Research Areas

Data Management and Analytics

Publication

Proceedings of the 2014 SIAM International Conference on Data Mining: April 24-26, Philadelphia, PA

First Page

388

Last Page

397

ISBN

9781611973440

Identifier

10.1137/1.9781611973440.45

Publisher

SIAM

City or Country

Philadelphia, PA

Additional URL

http://dx.doi.org/10.1137/1.9781611973440.45
