Publication Type

Conference Proceeding Article

Version

Postprint

Publication Date

4-2011

Abstract

Twitter as a new form of social media can potentially contain much useful information, but content analysis on Twitter has not been well studied. In particular, it is not clear whether as an information source Twitter can be simply regarded as a faster news feed that covers mostly the same information as traditional news media. In this paper we empirically compare the content of Twitter with a traditional news medium, New York Times, using unsupervised topic modeling. We use a Twitter-LDA model to discover topics from a representative sample of the entire Twitter. We then use text mining techniques to compare these Twitter topics with topics from New York Times, taking into consideration topic categories and types. We also study the relation between the proportions of opinionated tweets and retweets and topic categories and types. Our comparisons show interesting and useful findings for downstream IR or DM applications.

Keywords

Twitter, microblogging, topic modeling

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Management and Analytics

Publication

European Conference on Information Retrieval (ECIR) 33rd, Dublin, 18-21 April

First Page

338

Last Page

349

ISBN

9783642201608

Identifier

10.1007/978-3-642-20161-5_34

Publisher

Springer Verlag

City or Country

Dublin, Ireland

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

http://dx.doi.org/10.1007/978-3-642-20161-5_34
