Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
9-2022
Abstract
Entity alignment (EA) aims to find equivalent entities in different knowledge graphs (KGs). Current EA approaches suffer from scalability issues, limiting their usage in real-world EA scenarios. To tackle this challenge, we propose LargeEA to align entities between large-scale KGs. LargeEA consists of two channels, i.e., a structure channel and a name channel. For the structure channel, we present METIS-CPS, a memory-saving mini-batch generation strategy, to partition large KGs into smaller mini-batches. LargeEA, designed as a general tool, can adopt any existing EA approach to learn entities’ structural features within each mini-batch independently. For the name channel, we first introduce NFF, a name feature fusion method, to capture rich name features of entities without involving any complex training process; we then exploit a name-based data augmentation to generate seed alignment without any human intervention. Such a design fits common real-world scenarios much better, as seed alignment is not always available. Finally, LargeEA derives the EA results by fusing the structural features and name features of entities. Since no widely acknowledged benchmark is available for large-scale EA evaluation, we also develop a large-scale EA benchmark called DBP1M extracted from real-world KGs. Extensive experiments confirm the superiority of LargeEA against state-of-the-art competitors.
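The final fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two channels are combined, so the weighted sum below, the `name_weight` parameter, and the function names `fuse_similarities` and `align` are all assumptions made for illustration, not the method from the paper.

```python
from typing import List

Matrix = List[List[float]]


def fuse_similarities(structure_sim: Matrix, name_sim: Matrix,
                      name_weight: float = 0.5) -> Matrix:
    """Combine two similarity matrices of the same shape by a weighted sum.

    Rows are source-KG entities, columns are target-KG entities.
    The weighted sum is an assumed fusion rule, chosen for simplicity.
    """
    if len(structure_sim) != len(name_sim):
        raise ValueError("matrices must have the same number of rows")
    fused = []
    for s_row, n_row in zip(structure_sim, name_sim):
        if len(s_row) != len(n_row):
            raise ValueError("matrices must have the same number of columns")
        fused.append([(1.0 - name_weight) * s + name_weight * n
                      for s, n in zip(s_row, n_row)])
    return fused


def align(fused: Matrix) -> List[int]:
    """For each source entity, return the index of its most similar target."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Toy similarity scores for two source and two target entities (made-up values).
structure_sim = [[0.9, 0.1], [0.4, 0.6]]
name_sim = [[0.2, 0.8], [0.1, 0.9]]

fused = fuse_similarities(structure_sim, name_sim, name_weight=0.5)
alignment = align(fused)
```

With equal weights, the structural evidence for the first entity outweighs its name evidence, so it stays aligned to target 0; raising `name_weight` toward 1.0 would flip it to target 1.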
Discipline
Graphics and Human Computer Interfaces
Research Areas
Data Science and Engineering
Publication
Proceedings of the 48th International Conference on Very Large Databases, Sydney, Australia, 2022 September 5-9
Volume
15
First Page
233
Last Page
245
Identifier
10.14778/3489496.3489504
Publisher
ACM
City or Country
Australia
Citation
GE, Congcong; LIU, Xiaoze; CHEN, Lu; GAO, Yunjun; and ZHENG, Baihua.
LargeEA: Aligning entities for large-scale knowledge graphs. (2022). Proceedings of the 48th International Conference on Very Large Databases, Sydney, Australia, 2022 September 5-9. 15, 233-245.
Available at: https://ink.library.smu.edu.sg/sis_research/7178
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.