We propose collective entity linking over tweets that are close in space and time. This exploits the fact that events or geographical points of interest often result in related entities being mentioned in spatio-temporal proximity. Our approach applies directly to geocoded tweets. Where geocoded tweets are too sparse among all tweets, we use a relaxed version of spatial proximity that utilizes both geocoded and non-geocoded tweets linked by common mentions. Entity linking is affected by noisy extracted mentions and incomplete knowledge bases. Moreover, evaluating entity linking results often requires extensive manual annotation of mentions. To mitigate these challenges, we propose comparison-based evaluation, which assesses the change in linking quality when one linking method modifies the output of another. With this evaluation we show that differences between collective linking and local linking, i.e., linking entities in each tweet individually, are statistically significant. In extensive experiments, collective linking consistently yields more positive than negative changes to the linking quality. The ratio of positive to negative changes varies from 1.44 to 12, depending on the experiment settings.
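The comparison-based evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary layout, and the choice of a two-sided sign test are all assumptions made for illustration, since the abstract does not name the statistical test used. The idea shown is that only mentions whose link differs between the two methods are inspected, and each such change is counted as positive (wrong to right) or negative (right to wrong).

```python
from math import comb


def compare_linkings(baseline, modified, gold):
    """Count positive and negative changes between two linking outputs.

    baseline, modified: dict mapping mention id -> linked entity (hypothetical layout).
    gold: dict mapping mention id -> correct entity, needed only for changed mentions.
    """
    positive = negative = 0
    for mention, base_entity in baseline.items():
        new_entity = modified.get(mention, base_entity)
        if new_entity == base_entity:
            continue  # unchanged links do not enter the comparison
        if mention not in gold:
            continue  # changed but not annotated, so it cannot be judged
        if new_entity == gold[mention]:
            positive += 1  # the modification fixed a wrong link
        elif base_entity == gold[mention]:
            negative += 1  # the modification broke a correct link
        # if both links are wrong, the change is neither positive nor negative
    return positive, negative


def sign_test_p_value(positive, negative):
    """Two-sided sign test (an assumed choice of test) on the change counts."""
    n = positive + negative
    if n == 0:
        return 1.0
    k = min(positive, negative)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data, invented purely for illustration.
local = {"m1": "Paris_Hilton", "m2": "Apple_(fruit)", "m3": "Java_(island)", "m4": "Python"}
collective = {"m1": "Paris", "m2": "Apple_Inc.", "m3": "Java_(language)", "m4": "Python"}
gold = {"m1": "Paris", "m2": "Apple_Inc.", "m3": "Java_(island)"}

pos, neg = compare_linkings(local, collective, gold)
ratio = pos / neg if neg else float("inf")
print(pos, neg, ratio, sign_test_p_value(pos, neg))
```

In this sketch, mention `m4` needs no gold label because both methods agree on it, which reflects how a comparison of this kind can reduce annotation effort.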
Entity disambiguation, Concept linking, Entity linking
Databases and Information Systems | Social Media | Software Engineering
Data Management and Analytics
European Conference on Information Retrieval: Advances in Information Retrieval
CHONG, Wen Haw and LIM, Ee-peng.
Collective entity linking in tweets over space and time. (2017). European Conference on Information Retrieval: Advances in Information Retrieval. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/3720
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.