Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
12-2021
Abstract
Entity matching across two data sources is a prevalent need in many domains, including e-commerce. Of interest is the scenario where entities have varying granularity, e.g., a coarse product category may match multiple finer categories. Previous work on one-to-many matching generally presumes that the 'one' comes from a designated source and the 'many' from the other source. In contrast, we propose a novel formulation that allows concurrent one-to-many matching in either direction. Beyond flexibility, we also seek matching that is more robust to noisy similarity values arising from diverse entity descriptions, by introducing the notions of receptivity and reclusivity. In addition to an optimal formulation, we also propose an efficient and performant heuristic. Experiments on multiple real-life datasets from e-commerce sources show that our proposed algorithms are effective and outperform the baselines.
Keywords
entity resolution, matching, one-to-many, poly, bipoly
Discipline
Databases and Information Systems | Data Science
Research Areas
Data Science and Engineering
Publication
2021 IEEE International Conference on Data Mining ICDM: Auckland, Virtual, December 7-10: Proceedings
First Page
1192
Last Page
1197
ISBN
9781665423984
Identifier
10.1109/ICDM51629.2021.00143
Publisher
IEEE
City or Country
Piscataway, NJ
Embargo Period
12-13-2021
Citation
LEE, Ween Jiann; TKACHENKO, Maksim; and LAUW, Hady W.
Robust bipoly-matching for multi-granular entities. (2021). 2021 IEEE International Conference on Data Mining ICDM: Auckland, Virtual, December 7-10: Proceedings. 1192-1197.
Available at: https://ink.library.smu.edu.sg/sis_research/6434
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/ICDM51629.2021.00143