Publication Type

Working Paper

Version

acceptedVersion

Publication Date

11-2021

Abstract

Data-centric AI calls for better, not just bigger, datasets. As data protection laws with extra-territorial reach proliferate worldwide, ensuring datasets are legal is an increasingly crucial yet overlooked component of “better”. To help dataset builders become more willing and able to navigate this complex legal space, this paper reviews key legal obligations surrounding ML datasets, examines the practical impact of data laws on ML pipelines, and offers a framework for building legal datasets.

Keywords

Legal datasets, machine learning, data laws, data protection laws

Discipline

Computer Law | Databases and Information Systems | Internet Law

Research Areas

Innovation, Technology and the Law

First Page

1

Last Page

7

Copyright Owner and License

Authors

Comments

Accepted at NeurIPS 2021 Data-Centric AI Workshop

Additional URL

https://arxiv.org/abs/2111.02034
