Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
8-2021
Abstract
Data is one of the most critical resources in the AI Era. While substantial research has been dedicated to training machine learning models on various types of data, much less effort has been invested in assessing and governing data assets in the end-to-end processes of machine learning and data science, that is, the pipeline in which data is collected and processed, and machine learning models are then produced, requested, deployed, shared and evolved. To provide a state-of-the-art overview of this important and novel area and to advocate related research and development, we present a tutorial addressing two essential problems. First, in the machine learning pipeline, how can data and machine learning models be priced properly so that contributions from various parties can be assessed and recognized fairly? Second, when many parties collaborate in building, distributing and sharing machine learning models, how can data be managed as assets? Accordingly, the first part of our proposal surveys data and model pricing in the machine learning pipeline, while the second part discusses data asset governance for collaborative artificial intelligence. Each part is self-contained. At the same time, the two parts echo each other and connect a series of interesting and important problems into a dynamic big picture.
Keywords
Data asset, Data pricing, Data governance, Consensus, Blockchain, Privacy, Federated learning
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 14-18
First Page
4058
Last Page
4059
Identifier
10.1145/3447548.3470818
City or Country
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
Citation
PEI, Jian; ZHU, Feida; CONG, Zicun; XUAN, Luo; HUIWEN, Liu; and MU, Xin.
Data pricing and data asset governance in the AI Era. (2021). Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 14-18. 4058-4059.
Available at: https://ink.library.smu.edu.sg/sis_research/6903
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.