Publication Type
Conference Paper
Version
submittedVersion
Publication Date
11-2017
Abstract
Data fusion is a fundamental research problem of identifying true values of data items of interest from conflicting multi-sourced data. Although considerable research efforts have been conducted on this topic, existing approaches generally assume every data item has exactly one true value, which fails to reflect the real world where data items with multiple true values widely exist. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Then two aspects of source reliability are derived from these graphs and are used for estimating value veracity and initializing existing data fusion methods. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
Keywords
Data integration, Data fusion, Multi-valued data items, Inter-source agreements
Discipline
Databases and Information Systems | Data Storage Systems
Publication
36th International Conference on Conceptual Modeling, Valencia, Spain, 2017 November 6-9
Identifier
10.1007/978-3-319-69904-2_13
Publisher
Academy of Management
City or Country
Valencia, Spain
Citation
FANG, Xiu Susie; SHENG, Quan Z.; WANG, Xianzhi; BARHAMGI, Mahmoud; YAO, Lina; and NGU, Anne H.H.
SourceVote: Fusing multi-valued data via inter-source agreements. (2017). 36th International Conference on Conceptual Modeling, Valencia, Spain, 2017 November 6-9.
Available at: https://ink.library.smu.edu.sg/sis_research/3857
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/978-3-319-69904-2_13