Publication Type

Conference Proceeding Article

Version

Publisher’s Version

Publication Date

5-2017

Abstract

The problem of fine-grained tweet geolocation is to link tweets to their posting venues. We solve this in a learning-to-rank framework by ranking candidate venues given a test tweet. The problem is challenging because tweets are short and the vast majority are not geocoded, so information for building models is sparse. Nonetheless, although only a small fraction of tweets are geocoded, we find that they are posted by a substantial proportion of users. Such users therefore have location history data. Together with tweet posting time, this history serves as additional contextual information for geolocation. In designing our geolocation models, we also exploit two properties: (1) spatial focus, where users are more likely to visit venues near each other, and (2) spatial homophily, where venues near each other tend to share more similar tweet content than venues farther apart. Our proposed model significantly outperforms content-only approaches.

Discipline

Databases and Information Systems | Social Media | Theory and Algorithms

Research Areas

Data Science and Engineering

Publication

Proceedings of the 11th AAAI Conference on Web and Social Media ICWSM 2017: Montreal, Canada, May 15-18

First Page

488

Last Page

491

ISBN

9781577357889

Publisher

AAAI Press

City or Country

Menlo Park, CA

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Additional URL

https://aaai.org/ocs/index.php/ICWSM/ICWSM17/paper/view/15563
