Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
11-2017
Abstract
Data fusion is the fundamental research problem of identifying the true values of data items of interest from conflicting multi-sourced data. Although considerable research effort has been devoted to this topic, existing approaches generally assume that every data item has exactly one true value, which fails to reflect the real world, where data items with multiple true values are common. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Two aspects of source reliability are then derived from these graphs and used for estimating value veracity and initializing existing data fusion methods. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
Keywords
Data integration, Data fusion, Multi-valued data items, Inter-source agreements
Discipline
Databases and Information Systems | Data Storage Systems
Publication
Conceptual modeling: ER 2017: 36th International Conference, Valencia, Spain, November 6-9: Proceedings
Volume
10650
First Page
164
Last Page
172
ISBN
9783319699042
Identifier
10.1007/978-3-319-69904-2_13
Publisher
Springer
City or Country
Cham
Citation
FANG, Xiu Susie; SHENG, Quan Z.; WANG, Xianzhi; BARHAMGI, Mahmoud; YAO, Lina; and NGU, Anne H.H.
SourceVote: Fusing multi-valued data via inter-source agreements. (2017). Conceptual modeling: ER 2017: 36th International Conference, Valencia, Spain, November 6-9: Proceedings. 10650, 164-172.
Available at: https://ink.library.smu.edu.sg/sis_research/3941
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/978-3-319-69904-2_13