Publication Type
Book Chapter
Version
publishedVersion
Publication Date
11-2020
Abstract
The availability of big data is crucial for modern machine learning applications and services. Federated learning is an emerging paradigm that unites different data owners for machine learning on massive data sets without worrying about data privacy. Yet data owners may still be reluctant to contribute unless their data sets are fairly valued and paid for. In this work, we adapt the Shapley value, a widely used data valuation metric, to valuing data providers in federated learning. Prior data valuation schemes for machine learning incur high computation costs because they require training extra models on all combinations of data sets. For efficient data valuation, we approximately construct all the models necessary for data valuation using the gradients from training a single model, rather than training an exponential number of models from scratch. On this basis, we devise three methods for efficient contribution index estimation. Evaluations show that our methods accurately approximate the contribution index while notably accelerating its calculation.
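To make the cost mentioned in the abstract concrete, the sketch below computes exact Shapley values from the textbook definition for a handful of data providers. It is not the chapter's method: the function `shapley_values`, the provider names, and the `toy_utility` function (which stands in for "train a model on this coalition's data and measure its quality") are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values from the standard definition.

    `utility` maps a frozenset of players to a number. It is evaluated on
    every coalition, so the cost grows exponentially with len(players).
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical data set sizes for three providers (assumption, not from the chapter).
SIZES = {"A": 100, "B": 300, "C": 600}


def toy_utility(coalition):
    # Placeholder for model quality: here simply the share of all data held.
    return sum(SIZES[p] for p in coalition) / 1000.0


values = shapley_values(list(SIZES), toy_utility)
```

Because `toy_utility` is additive, each provider's value is simply its own share of the data, and the values sum to the utility of the full coalition. With a real utility, every call would mean training and evaluating a model on one coalition's data, which is the exponential cost the abstract says the proposed methods avoid by reusing gradients from a single training run.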
Keywords
Federated learning, Data valuation, Incentive mechanism, Shapley value
Discipline
Databases and Information Systems | Data Science
Research Areas
Data Science and Engineering
Publication
Federated Learning: Privacy and Incentive
First Page
139
Last Page
152
ISBN
9783030630751
Identifier
10.1007/978-3-030-63076-8_10
Publisher
Springer
City or Country
Cham
Embargo Period
3-17-2025
Citation
WEI, Shuyue; TONG, Yongxin; ZHOU, Zimu; and SONG, Tianshu.
Efficient and fair data valuation for horizontal federated learning. (2020). Federated Learning: Privacy and Incentive. 139-152.
Available at: https://ink.library.smu.edu.sg/sis_research/10125
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1007/978-3-030-63076-8_10