Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

9-2014

Abstract

User attribute extraction on social media has gained considerable attention, but existing methods are mostly supervised and suffer greatly from insufficient gold-standard data. In this paper, we validate a strong hypothesis based on homophily and adapt it to ensure the certainty of the user attributes we extract via weakly supervised propagation. Homophily, the theory that people who are similar tend to become friends, has been well studied in the setting of online social networks. Applied to the age attribute, this theory implies that online friends tend to be of similar age. In this work, we take a step further and study the hypothesis that the age gap between online friends becomes even smaller in a larger friendship clique. We empirically validate our hypothesis using two real social network data sets. We further design a propagation-based algorithm to predict online users’ age, leveraging the clique-based hypothesis. We find that our algorithm can outperform several baselines. We believe that this method could serve as a way to enrich sparse data and that the hypothesis we validated could shed light on exploring the proximity of other user attributes, such as education.

Keywords

Social Network Analysis, Age Prediction, Homophily

Discipline

Databases and Information Systems | Social Media

Publication

HT'14: Proceedings of the 25th ACM Conference on Hypertext and Social Media: September 1-4, 2014, Santiago, Chile

First Page

98

Last Page

106

ISBN

9781450329545

Identifier

10.1145/2631775.2631800

Publisher

ACM

City or Country

New York

Copyright Owner and License

LARC

Additional URL

http://doi.org/10.1145/2631775.2631800
