Duties for datasets

Publication Type

Book Chapter

Publication Date

12-2023

Abstract

Machine learning (ML) systems are increasingly being deployed in contexts, such as law, medicine and finance, where system errors present serious and foreseeable risks. As the behaviour of ML systems is largely determined by their training inputs, should dataset providers owe duties of care to victims? Using the ImageNet dataset and the Generative Pre-trained Transformer (GPT) models as case studies, this chapter argues that the conventional approach of centralising duties on system providers alone yields insufficient safeguards. Dataset-specific duties should also be considered to incentivise precaution in the preparation of crucial ML input. The chapter analyses how dataset duties may be encompassed in existing tort law, surfacing situations where duties are more appropriate: for instance, where a dataset is intended to be used in a risky context, the dataset provider actively influences system outputs, and the dataset is published without safety restrictions or warnings.

Keywords

Datasets, machine learning, tort law

Discipline

Artificial Intelligence and Robotics | Science and Technology Law

Research Areas

Innovation, Technology and the Law

Publication

Data and Private Law

Editor

Damian Clifford, Lau Kwan Ho & Jeannie Marie Paterson

First Page

207

Last Page

224

ISBN

9781509966059

Identifier

10.5040/9781509966059.ch-013

Publisher

Hart Publishing

City or Country

Oxford

Additional URL

https://doi.org/10.5040/9781509966059.ch-013
