Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

12-2014

Abstract

Twitter, one of the most popular social media platforms, has been studied from different angles. One of the important sources of information in Twitter is users’ biographies, which are short self-introductions written by users in free form. Biographies often describe users’ background and interests. However, to the best of our knowledge, there has not been much work trying to extract information from Twitter biographies. In this work, we study how to extract information revealing users’ personal interests from Twitter biographies. A sequential labeling model is trained with automatically constructed labeled data. The popular patterns expressing user interests are extracted and analyzed. We also study the connection between interest tags extracted from user biographies and tweet content, and find that there is a weak linkage between them, suggesting that biographies can potentially serve as a complementary source of information to tweets.

Keywords

Extract information, Freeforms, Labeled data, Social media platforms, Sources of information, User interests

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing | Social Media

Research Areas

Data Science and Engineering

Publication

Information Retrieval Technology: 10th Asia Information Retrieval Societies Conference, AIRS 2014, Kuching, Malaysia, December 3-5, 2014: Proceedings

Volume

8870

First Page

268

Last Page

279

ISBN

9783319128436

Identifier

10.1007/978-3-319-12844-3_23

Publisher

Springer

City or Country

Cham

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1007/978-3-319-12844-3_23
