Publication Type

Journal Article

Version

acceptedVersion

Publication Date

10-2023

Abstract

A fundamental problem in many scenarios is to match entities across two data sources. Prior work frequently presumes that the entities to be matched are of comparable granularity. In this work, we address one-to-many matching, or poly-matching, in the scenario where entities have varying granularity. A distinctive feature of our problem is its bidirectional nature: the 'one' or the 'many' could come from either source arbitrarily. Moreover, to deal with diverse entity representations that give rise to noisy similarity values, we incorporate novel notions of receptivity and reclusivity into a robust matching objective. As finding the optimal solution to the resulting formulation is proven to be computationally intractable, we propose more scalable yet still performant heuristics. Experiments on multiple real-life datasets demonstrate the effectiveness of our proposed algorithms, which outperform the baselines.

Keywords

Lenses, Cameras, Noise measurement, Matched filters, Soft sensors, Mathematical models, Linear programming, Entity resolution, matching, one-to-many

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Publication

IEEE Transactions on Knowledge and Data Engineering

Volume

35

Issue

10

First Page

10762

Last Page

10774

ISSN

1041-4347

Identifier

10.1109/TKDE.2023.3266480

Publisher

Institute of Electrical and Electronics Engineers

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1109/TKDE.2023.3266480
