Conference Proceeding Article
Due to the fast development of social media on the Web, Twitter has become one of the major platforms for people to express themselves. Because of the wide adoption of Twitter, events such as breaking news and the release of popular videos can easily catch people’s attention and spread rapidly on Twitter, and the number of relevant tweets approximately reflects the impact of an event. Event identification and analysis on Twitter has thus become an important task. Recently the Recurrent Chinese Restaurant Process (RCRP) has been successfully used for event identification from news streams and news-centric social media streams. However, our preliminary experiments show that these models cannot be directly applied to Twitter, mainly for two reasons: (1) Events emerge and die out fast on Twitter, while existing models ignore this burstiness property. (2) Most Twitter posts are personal-interest oriented, while only a small fraction are event-related. Motivated by these challenges, we propose a new nonparametric model which considers burstiness. We further combine this model with traditional topic models to identify both events and topics simultaneously. Our quantitative evaluation provides sufficient evidence that our model can accurately detect meaningful events. Our qualitative evaluation also yields interesting analysis of events on Twitter.
Computer Sciences | Databases and Information Systems | Social Media
Data Management and Analytics
Proceedings of the 2014 SIAM International Conference on Data Mining: April 24-26, Philadelphia, PA
DIAO, Qiming and JIANG, Jing. Recurrent Chinese Restaurant Process with a Duration-based Discount for Event Identification from Twitter. (2014). Proceedings of the 2014 SIAM International Conference on Data Mining: April 24-26, Philadelphia, PA. 388-397. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/2412